Temperature Dependence of the Vibrational Spectrum of Porphycene: A Qualitative Failure of Classical-Nuclei Molecular Dynamics

The temperature dependence of vibrational spectra can provide information about structural changes of a system and also serve as a probe to identify different vibrational mode couplings. Fully anharmonic temperature-dependent calculations of these quantities are challenging due to the cost associated with statistically converging trajectory-based methods, especially when ac-counting for nuclear quantum effects. Here, we train a high-dimensional neural network potential energy surface for the porphycene molecule based on data generated with DFT-B3LYP, including pairwise van der Waals interactions. In addition, we ﬁt a kernel ridge regression model for the molecular dipole moment surface. The combination of this machinery with thermostatted path integral molecular dynamics (TRPMD) allows us to obtain well-converged, full-dimensional, fully-anharmonic vibrational spectra including nuclear quantum effects, without sacriﬁcing the ﬁrst-principles quality of the potential-energy surface or the dipole surface. Within this framework, we investigate the temperature and isotopologue dependence of the high-frequency vibrational ﬁngerprints of porphycene. While classical-nuclei dynamics predicts a red shift of the vibrations encompassing the NH and CH stretches, TRPMD predicts a strong blue shift in the NH-stretch region and a smaller one in the CH-stretch region. We explain this behavior by analyzing the modulation of the effective potential with temperature, which arises from vibrational coupling between quasi-classical thermally activated modes and high-frequency quantized modes. moment of the k -th replica. The estimator entering the vibrational density of states, due to the linearity of the velocity operator, is simply the velocity of the ring-polymer centroid. TRPMD is known to yield a good approximation to quantum time correlation functions that involve operators with a linear or almost linear dependence on positions. It does not capture nuclear quantum-mechanical coherence, that can become more relevant at lower temperatures. We do not expect these effects to play a determinant role in the temperature range considered in the present study.

Please note that technical editing may introduce minor changes to the text and/or graphics, which may alter content. The journal's standard Terms & Conditions and the Ethical guidelines still apply. In no event shall the Royal Society of Chemistry be held responsible for any errors or omissions in this Accepted Manuscript or any consequences arising from the use of any information it contains.
In this paper, we report the training of a high-dimensional neural network potential 23,24 (HDNNP) based on data from densityfunctional theory (DFT) with a hybrid exchange correlation functional (B3LYP [25][26][27][28] ) and including pairwise van der Waals interactions 29 . We complement this potential by a Kernel Ridge Regression (KRR) model trained on the same ab initio calculations to reproduce the dipole moment of this molecule. This setup allows the classical-nuclei and path-integral-based molecular dynamics simulations to effortlessly reach the nanosecond time scale with ab initio accuracy. We address standard porphycene (Pc-d0), the isotopologue where the inner-cage hydrogens have been substituted by deuterium (Pc-d2), and the isotopologue where only the outer-cage hydrogens have been substituted by deuterium (Pc-d12) (see Figure 1). These isotopologues have been the subject of previous experimental and theoretical work 3,[30][31][32][33] . Based on the linear response time-correlation formalism, we compute the vibrational density of states and infrared spectra in a wide temperature range for the different isotopologues. We analyze their similarities and differences and address the relevance of including anharmonic contributions and nuclear-quantum effects in their full-dimensional nature, especially focusing on the NH-stretch signal which is the most puzzling feature of the vibrational spectrum of this molecule.
This manuscript is organized as follows: In Sec. 2.1 we present the details of the DFT calculations used to generate the data on which the HDNNP and the KRR are based on. In Sec. 2.2, we provide a short theoretical background on the observables that we are going to compute. The training of the HDNNP potential and the KRR model, with the respective data set description and the validation tests are described in Sec. 2.3 and 2.4 followed by the simulations protocol in Sec 2.5. Having established a reliable description of our system we discuss our results for the temperature dependence of the vibrational spectra and the physical origins of the observed behaviors in Sec. 3. Finally, in Sec. 4, we close the manuscript summarizing our most important findings.

Electronic structure
The density-functional theory calculations were performed with the FHI-aims all-electron code 34 . According to previous benchmarks performed in Ref. 2 , the B3LYP functional including pairwise van der Waals (vdW) corrections yields a good agreement to benchmarkquality CCSD(T) reference data for stationary points on the PES, barrier heights and geometrical properties. In this manuscript, we used most of the data generated from the ab initio molecular dynamics trajectories in Ref. 2 and augmented it by a few points, as detailed below, to train the HDNNP. We have therefore consistently used the light settings of the FHI-aims code package for all new calculations. For stationary points on the potential energy surface, we observed that light settings yield differences below 5 meV in relative energies and 0.005 Å in atomic distances (see Ref. 2 ) with respect to tight settings. An assessment of the impact of different numerical grids and basis sets settings on the vibrational spectrum of porphycene is shown in Fig. S1 in the SI. We observe differences in peak positions of at most 20 cm −1 and minimal differences in the harmonic peak intensities.

IR spectrum and vibrational density of states
The IR spectrum and vibrational density of states presented in this paper were calculated within the linear-response time-correlation formalism. Within this formalism, based on Fermi's golden rule 35 , the IR adsorption cross section σ (ω) is proportional to where ω is the frequency, µ is the molecular dipole moment, andC µ µ (ω) is the Fourier transform of the Kubo-transformed dipole-dipole time-correlation function.
The vibrational density of states is given by 36,37 vDOS where the index i runs through all 3N nuclear degrees of freedom in the system, and v i is the velocity component of each degree of freedom. We note that the VDOS given in Eq. 2 does not contain cross-terms involving the correlation of the velocities between different degrees of freedom, which instead contribute to the dipole spectrum. * The definition in Eq. 2 is often connected to the incoherent dynamic structure factor in neutron scattering experiments 36,37 and obeys well known sum rules 38,39 . The integral of the spectrum is proportional to the temperature 39 , such that in order to compare VDOS spectra at different temperatures, we have divided each one by their respective target temperature.
The time evolution of the dipole moment and the positions (and their derivatives) has been computed within the Born-Oppenheimer approximation with classical nuclei molecular dynamics (MD) and thermostatted ring-polymer molecular dynamics (TRPMD) 40 . TRPMD is an approximation to nuclear quantum dynamics based on classical dynamics in the extended phase space of the ring polymer 40,41 . The methodology is very similar to ring-polymer molecular dynamics 41 , with the difference that the internal modes of the ring polymer are attached to thermostats, while the centroid follows Hamiltonian dynamics. Its connection to quantum dynamics has been shown as the short-time approximation to Matsubara dynamics [42][43][44] . There is a freedom in the choice of the thermostatting procedure 40 , and it has been recently shown that generalized Langevin equation (GLE) thermostats can be tailored to yield optimal spectra 45 . Here, we used the parameters proposed in Ref. 45 for the GLE thermostats, which mitigate the spurious broadening of the TRPMD vibrational lineshape. The estimator for the dipole was computed as µ P = ∑ P k=1 µ k /P, where k runs over the P ring-polymer beads and µ k is the dipole moment of the k-th replica. The estimator entering the vibrational density of states, due to the linearity of the velocity operator, is simply the velocity of the ring-polymer centroid. TRPMD is known to yield a good approximation to quantum time correlation functions that involve operators with a linear or almost linear dependence on positions. It does not capture nuclear quantum-mechanical coherence, that can become more relevant at lower temperatures. We do not expect these effects to play a determinant role in the temperature range considered in the present study.

Potential Energy Surface from High-Dimensional Neural Networks
High-dimensional neural network potentials as proposed by Behler and Parrinello 23 allow to represent complex potential energy surfaces of very large systems containing thousands of atoms and all their associated degrees of freedom. In this approach the total energy E of the system is constructed as a sum of atomic energies E i , which depend on the local atomic environments up to a cutoff radius R c , which has typically a value between 6 and 10 Å. The positions of all neighboring atoms inside these cutoff spheres are described by vectors of many-body atom-centered symmetry functions 24 , which incorporate the required translational, rotational and permutational invariance of the PES. These symmetry function vectors serve as inputs for individual atomic neural networks (NNs) yielding the E i . For each element present in the system there is one atomic NN, i.e. the architectures and parameters of the atomic neural networks for a given chemical element are constrained to be the same, which is replicated as many times as there are atoms of the respective element. The parameters of the atomic NNs are determined in an iterative training process using known reference energies and forces from electronic structure calculations. No energy partitioning is required as the atomic energies are optimized to yield the correct total energies, which in the present work has been done using the RuNNer code [46][47][48] . HDNNP energies and forces can be evaluated at about the same computational cost as empirical potentials, enabling the generation of long (and many) trajectories with minimal loss of accuracy. For a comprehensive review on the methodology and construction of HDNNPs we refer the readers to Refs. 46,47 .

Data set, neural network architecture and global accuracy.
The HDNNP has been parameterized starting from an initial data set of structures that was obtained from three different simulation sources as shown in Table 1. The first, relatively small, set consists of single point evaluations at stationary points of the PES and their vicinity. The second and third sets, which represent almost 90 % of the final data set, came from ab initio MD and PIMD simulations at 150 and 290 K that were performed for the study presented in Ref. 2 . Because we wanted to extend the range of validity for the HDNNP to higher temperatures, where different regions of the conformational space can be reached, a fourth set containing structures sampled at 500, 600, and 700 K was included during the training to avoid any type of extrapolation. Finally, to augment the number of data points near the saddle points, 2000 additional structures from an umbrella sampling simulation were added. In total, these two additional sets computed only for this work represent less than 10 % of the total data. The data for all 5 different sets was computed with exactly the same numerical settings and level of theory.
Set # Description Structures 1 Stationary points set 244 2 Classical MD set 11033 3 Path integral MD set 80665 4 High temperature set 7795 5 Umbrella sampling set 2000 Table 1 Description of the data sets available for the construction of the HDNNP. Data sets 1, 2 and 3 were available from a previous work 2 , while data sets 4 and 5 were produced for this study. High temperature simulations refer to additional structures obtained from simulations at 500, 600 and 700 K.  Their overall accuracy is commonly evaluated in terms of the energy and force root mean square error (RMSE) for both training and test sets. Overall, we have been able to reach the typical RMSEs of less than ∼ 1 meV/atom for the energy and less than ∼ 100 meV/bohr for the force 47 , which are known to yield very good results. We show in Figure S2 in the SI scatter plots comparing DFT energies and HDNNP predictions for points spanning up to 1 eV from the global minimum and append to the SI the numerical parameters for all symmetry functions of all NNs.

Validations tests
Since the overall accuracy of all the HDNNPs is comparable, we present the following results as average values calculated from the several trained HDNNPs. In the case of observables involving quantum-nuclear statistics, we exclude the HDNNPs C, D and E because they were not trained explicitly with path integral MD data. We start our validations tests looking at the stationary points generated by the HDNNP. For this purpose, instead of using the geometries given by DFT, which are not necessarily stationary points on the new generated PES, we performed new geometry optimizations for each HDNNP. The results are shown in Table 3 and show an overall overestimation of the energies of ∼ 5% with respect to the reference values even though the optimized geometries with the HDNNPs are almost identical (See Tab. SII in SI ). More interesting are the NH stretch frequencies at those stationary points reported in Table 4. The degree of accuracy observed is remarkable, given that we have not explicitly included any Hessian information during the training procedure.
290 308 (9)  A considerably more challenging task is the evaluation of the accuracy of the HDNNP on finite-temperature properties of interest. We were in the favorable situation where we had many pre-computed DFT molecular dynamics trajectories at our disposal so we could compare the predictions of the HDNNP and DFT for such properties directly. In section II.C and II.D of the SI we show these validation tests for quantum-nuclear potential-energy probability densities, VDOS and IR spectra. We find in general that the HDNNP is very accurate for all of these quantities. Small discrepancies appear between ∼ 600 cm −1 and ∼ 1250 cm −1 in the vibrational spectra, and we can connect them to inaccuracies either in the HDNNP potential or in the KRR model for the dipole (described below). Since in this work we will not be focusing in detail on this frequency region, we consider the current setup to be accurate enough for our purposes.

Dipole surface from Kernel Ridge Regression
Although it is possible to incorporate an approximation to the dipole moment within the HDNN 49 , we here chose to use a KRR model 50 with Gaussian kernels in order to compute the molecular dipole moment of a particular molecular structure. This is a non-linear regression where an l2-norm regularization is considered in the loss-function, which is then minimized. This method is formally equivalent to a Gaussian Process Regression 51 , but the hyperparameter optimization procedure can differ.
Here we do not make any assumption on the analytical form of the dipole moment, and model each Cartesian component µ i separately by where the index j runs over the number of points in the training set N T , w i j are the weights, K(X , X j ) = e −(X −X j ) 2 /2σ 2 is the kernel with σ being an hyper-parameter which controls the degree of correlation between training points. X is a vector containing the representation of a particular molecular structure.
By equating the derivative of the loss function to zero, the weights for each dipole component are straightforwardly determined by where is the kernel covariance matrix I I I is the N T × N T identity matrix and the hyperparameter λ sets the strength of the regularization. Both hyperparameters σ and λ are not determined by the previous equation. We determine these parameters through a deterministic grid search, as described below.
The efficiency of most KRR models depends on the representation X that enters the model. Many groups have proposed strategies to find optimal representation of molecular geometries that can greatly reduce the size of the training set necessary to reach acceptable accuracy 18,52,53 . Symmetry considerations that allow the extension of this model to the prediction of full tensorial quantities of different ranks have also been recently addressed 54 . However, for this work, a very large quantity of data was already available from previous simulations and the system considered here is a relatively rigid molecule. Therefore, we could simply align all geometries and dipole moments to a reference frame and use the Cartesian components as a representation (i.e. no representation) to predict each component of the dipole moment separately. The alignment was performed using the following expression y y y = R R R(y y y)y y y (6) µ µ µ (y y y ) = R R R(y y y)µ µ µ(y y y) where y y y and y y y are the 3N Cartesian coordinates vector of the initial and aligned geometry respectively, N the number of atoms in the molecule and µ µ µ and µ µ µ their corresponding dipole moments. The 3 × 3 unitary R R R matrix was obtained with the Kabsch algorithm 55 and has the property of minimizing the root mean squared deviation (RMSD) of the Cartesian coordinates with respect to the reference frame. We have excluded all hydrogen atoms from this procedure. We used a total of ∼ 9500 data points of which 90% was selected as part of training and the remaining 10% was assigned to the test set. We performed a random selection 5 times in order to obtain statistically meaningful results for the performance of our model. For each division of the data, the hyperparameters σ and λ were determined using the QML package 56 by a grid search minimization of the root mean square error (RMSE) on the test set (see Table 5). As shown in Table 5, the parameters are very similar for all three components, and using the same values for all would result in a minimal difference in the errors.
In order to describe the accuracy of the dipole moment fit, we report in Table 5 the coefficient of determination, r 2 , and the RMSE normalized by the standard deviation (STD) ε = 100 × RMSE/STD. ε has the advantage of being more sensitive to outliers, in comparison to the mean absolute error, and also allows quantities of different nature to be compared to one another.

Hyperparameters
Training set Test set  Although the errors on the test set are somewhat large (in percentage), they are of a comparable magnitude to what was reported in previous work, that nevertheless could reach the same accuracy with a lesser amount of data and more sophisticated approaches 54,57 . We observe that the models above (with such errors), actually lead to a very good prediction (with smaller errors) of the IR intensities, as shown in Fig. S6 in the SI. This effect has also been oberved and discussed in Ref. 58 for Raman intensities.

Simulation protocol
All trajectories were computed with HDNNP B using the LAMMPS software 59 and its interface with the RuNNer code 60 . We then used the socket interface between LAMMPS and the i-PI code 61,62 in order to communicate energy and forces to perform classical-nuclei and path-integral molecular dynamics. In order to obtain the TRPMD VDOS, we ran 10 different imaginary path-integral trajectories of 110 ps per temperature, with a time-step of 0.25 fs. The first 10 ps of these runs were discarded as equilibration. We launched a 10 ps long TRPMD trajectory each 10 ps after that. This represents a total amount 1 ns of statistical sampling per temperature and per isotopologue. In table 6 we report the number of replicas used for the TRPMD simulations at each temperature.  For classical nuclei simulations, we followed a similar strategy, with the following few differences: (i) Since within classical mechanics any time independent equilibrium property is mass independent, we could use the same classical-nuclei NVT trajectories for different isotopologues; (ii) we employed a time step of 0.5 fs and (iii) started NVE trajectories from the thermalization run.
In all cases the dipole-moments were obtained a posteriori with the KRR model described above, and were calculated at an interval of 2.5 fs. Average nitrogen-nitrogen (a) and nitrogen-hydrogen/deuterium (b) distances at different temperatures obtained from classical MD (red), and PIMD for Pc-d0 (black) and Pc-d2 (blue) simulations. c) shows the average carbon-hydrogen/deuterium distances from classical MD (red), and PIMD for Pc-d0 (black) and Pc-d12 (blue). The circles in c) refer to the eight carbon-hydrogen/deuterium bonds that are part of the pyrrole rings while the diamonds refer to remaining ones. In all cases the statistical error is 0.001 Å or smaller. Dashed lines correspond to linear fits to the data, with the slope S reported in the figure. A value of S = 0 means that the observed slope is smaller than the statistical error.

Results and Discussion
Geometrical equilibrium properties often have a decisive impact in time dependent properties. In porphycene, a particular geometrical property of interest is the equilibrium N-N distance d(NN) at a given temperature. This is because the dynamics of the hydrogens within the cage is strongly coupled to low frequency modes which modulate the cage size and therefore also d(NN) 63,64 . In Fig. 2(a) we show the average d(NN) as a function of temperature, for the Pc-d0 and Pc-d2 isotopologues, considering quantum and classical nuclei. For classical nuclei simulations, both istopologues (must) yield the same equilibrium geometrical properties. In all cases, the cage size, reflected by d(NN), increases linearly with temperature. In the temperature range studied, d(NN) of Pc-d0 is smaller than of Pc-d2 and both are smaller than the classical nuclei d (NN). This geometrical isotope effect (GIE) is the so-called secondary-GIE or the Ubbelohde effect [65][66][67] . We obtain an almost temperature-independent value of this secondary GIE of 0.016 Å, which compares very well with the experimental inferred one of 0.02 Å 32 . As pointed out by Ubbelohde more than 60 years ago, the "structures may generally be expected to be closer for H than for D-bonds, because of differences in zero-point vibrations" 65 , even though inverse Ubbelohde effects are possible for weak H-bonds, and have been observed 67,68 . In porphycene, where strong directional H-bonds are present, the nuclear-quantum fluctuations along the anharmonic N-H bond increase the N-H bond length d(NH). As shown in Figure  2(b) this effect is greater for Pc-d0 than for Pc-d2, which is the so-called primary isotope effect. In both cases, this bond elongation strengthens the H-bond and consequently reduces d(NN), inducing also a correlation between d(NN) and d(NH/D), shown in Fig. 3(a). The similarity of the values of the linear slope S of d(NN) for Pc-d0 and Pc-d2 above 100 K mean that the cage expansion can be treated classically in this temperature range. At 50 K we observe a deviation from the linear slope for Pc-d0, which is consistent with the observation that the lowest cage vibration mode lies at 65 cm −1 (≈90 K).   The positive sign of S can also be rationalized by analyzing the 1-D PES cut along the d(NN) coordinate shown in Fig. 3(b). The cage modulation is quite anharmonic and the potential is softer towards larger d(NN), such that when more thermal energy is available the average d(NN) will increase. The fact that the cage is much smaller in the quantum case than in the classical case also promotes the population of quasi-cis structures that present a permanent dipole, as has been discussed in Ref. 2 . For the situations above we found that the analysis for Pc-d12 is the same as for Pc-d0, such that we do not show this data.
For d(CH/D) and d(NH/D) in Figs. 2(b) and 2(c) we observe no temperature dependence when treating the nuclei quantum mechanically. This is expected, since the frequencies of the corresponding vibrational modes (>1900 cm −1 , 2730 K) mean that more than 99.9 % of the population stays in the vibrational ground state. In contrast, for classical nuclei the bond elongation increases with temperature even though the coupling between d(CH/D)/d(NH/D) and d(NN) is rather small, as shown in Fig. 3(a). The bond length increase is thus due to the temperature-dependent increase in population of this mode when treated classically and its anharmonic profile. The (quantum) temperature dependence of these geometrical properties for the deuterated isotopologues follow closely the ones for standard porphycene, but the deuterated bonds are always smaller than their hydrogenated counterparts.

Temperature-dependent vibrational spectra
Having analyzed the geometrical properties of porphycene and its isotopologues, we turn our attention to vibrational spectra. In Fig.  4 we show the TRPMD VDOS for the three isotopologues in the NH (upper panels) and CH (bottom panels) stretch regions. The full frequency range is reported in the SI, Fig. S10. For Pc-d0 and Pc-d12 (panels a and c), the position and temperature dependence of the NH stretch peak around 2600 cm −1 is similar. Importantly, the broadening increases with temperature and it presents an unexpected blue shift of 150 cm −1 when the temperature is increased from 50 to 400 K. For Pc-d2 (panel b), the ND stretch peak appears at around 2050 cm −1 . This value is 200 cm −1 above the trivial harmonic mass-scaled frequency, which highlights the anharmonic nature of this vibration. In this case the peak is narrower, and while both the blue shift and the broadening with increasing temperature are still present, their magnitude is smaller if compared to Pc-d0.
The CH/D stretch peaks present, for all temperatures and isotopologues, a bimodal distribution emerging from the two types of CH bonds in this molecule: the ones belonging to the pyrrole rings and the ones which are connected to the 'bridge' positions. For the Pc-d0 and Pc-d2 cases (panels a and b), the CH peaks show a modest blue shift of ∼ 20 cm −1 and 10 cm −1 when the temperature is increased. On the contrary, the blue shift is not observed for Pc-d12 (panel c).
The classical-nuclei VDOS at the same vibrational regions and temperatures are shown in Fig. 5 for Pc-d0 (panel a) and Pc-d2 (panel b). The full frequency range is reported in the SI, Fig. S11. It is clear that, contrary to the TRPMD spectra, the NH and ND-stretch peaks slightly red-shift (20 cm −1 ) with increasing temperature. Additionaly, these vibrations are situated much closer to the values predicted by the harmonic approximation in this case: 2950 and 2180 cm −1 for Pc-d0 and Pc-d2, respectively. The CH-stretch peaks also consistently red-shift with increasing temperature, and are also situated very close to the harmonic-approximation value of about 3175 and 3250 cm −1 .
Clearly, the nature of the high frequency vibrational peaks is strongly affected by nuclear quantum fluctuations, altering both the peak positions and their temperature dependence. In the next section, we explore the physical origins of these effects.

Physical Origin of Classical and Quantum Temperature Dependence
We now rationalize the stark differences observed in the temperature dependence of the vibrational NH/D and CH/D stretch regions when considering TRPMD and classical nuclei dynamics. As discussed previously, the increase of the temperature induces an expansion of the porphycene cage. Assuming that an adiabatic separation of the high and low frequency modes is valid, we can ask ourselves Faraday Discussions   how the local potential experienced by the NH and CH modes change with increasing temperature, due to the change in d(NN). In Fig. 6(a) and (b) we show the dependency of the harmonic NH/D and CH/D frequencies with the thermally accessible d(NN) values. We obtained these results by constraining d (NN) at particular values, relaxing all other degrees of freedom, and then performing a harmonic vibrational analysis. We observe that the harmonic NH stretch frequency (ν NH ) spans a range of 1000 cm −1 and blue-shifts with increasing d(NN), evidencing the strong coupling of the NH stretch vibration with modes that modulate the cage size. The ND stretch shows the same behavior, but the shift is less pronounced. The CH stretch frequency also blue shifts with increasing d(NN), but only by less than 40 cm −1 on the same d(NN) interval and the CD frequency is almost constant.
The simplest model to understand the physics of the NH-stretch potential energy, where exact quantum dynamics can be solved numerically, is a 1D double-well potential that changes with temperature, as shown in Figure 7. The potential energy surfaces shown there were obtained by calculating the minimum-energy pathway (MEP) projected on the hydrogen-transfer coordinate (see caption of Fig. 7 for a definition of this coordinate). We calculated the MEP for two constrained d(NN) that are representative of the lowest and highest temperature in this study. The MEP between these two constrained minima was obtained with the nudged elastic band method 69 and it was extended by a harmonic potential at low and high d(NN), consistent with the corresponding NH frequency, in order to model the repulsive parts of the well.
In the classical-nuclei picture, temperature has two important effects. On the one hand, it induces a cage expansion with increasing temperature, and on the other hand, it also increases the thermal population of the NH vibrational well, as pictorially shown in Figure  7. The first effect makes the walls of the potential increase more steeply with increasing temperature, thus increasing the curvature of the well (blue-shift), as evidenced in 6(a). The second effect causes the nuclei to explore more anharmonic parts of the well at higher energies, thus causing a red-shift. The final result is a competition between both effects and, for the classical case, the full-dimensional VDOS ( Figure 5) shows that the second one wins at certain temperatures, giving a net red-shift that becomes smaller (and eventually could turn into a blue shift) as the temperature is increased.
In quantum mechanics, however, the situation is quite different. The cage modulation with increasing temperature is similar to the classical-nuclei case. This can be easily understood since the energies of these modes lie at around 200 cm −1 (287 K) such that the thermal energy is comparable to the energy level spacing. On the other hand, the NH mode is still completely quantized up to 400 K and its thermal activation is negligible. The ZPE connected to this mode ensures that at all temperatures studied here, the anharmonic parts of the potential are assessed beyond what could be assessed by the classical nuclei simulations, thus explaining the overall redshift when comparing quantum simulations to the classical ones. However, the insignificant thermal activation in this temperature interval makes this vibration solely sensitive to the modulation of the cage-mode with temperature, such that there is no competition Faraday Discussions between two effects and the result can only be a blue shift. This effect is also exemplified in Figure 7. We have calculated the exact quantum mechanical first and second vibrational states connected to the 1D double well potentials presented in the figure, following the algorithm of Ref. 70 , where the MEP was fitted by a fourth order polynomial in the [-0.5 Å,+0.5 Å] interval in order to obtain an analytical form of the potential. We show an average of the tunneling-splitted levels, since these cannot be captured by TRPMD and would be smaller than the temperature broadening at the temperatures studied here. The less-pronounced effect for ND can also be understood within this model, in that the differences between the first vibrational excited state in the two potentials would be smaller.
The 1D potentials shown in Fig. 7 is useful to understand the physics of this problem and the differences between classical and quantum treatment of vibrational degrees of freedom when vibrational coupling between low and high energy modes is present. It also shows that the classical dynamics in the extended phase space of the ring polymer of TRPMD is successful in capturing the main quantum effect that would dictate the temperature dependence of this vibrational peak in full quantum dynamics. However, we find that it is not possible to obtain quantitative results for the observed temperature-dependence of the vibrational frequencies with this simple model. This once more highlights the inherently multi-dimensional nature of this problem.
Finally, for the CH vibrations we observed a much less pronounced blue-shift with increasing temperature. In this case, the CH potential is better described by a Morse potential and perturbative treatments could give reasonable results. We have thus performed second order vibrational perturbation calculations 71,72 (VPT2). They show important couplings between the CH stretch and the CH bending modes. The results reproduce the average peak positions (see Table S4 in the SI) that we observe in the TRPMD simulations, and these are all red-shifted with respect to the harmonic estimation. However, also this coupling does not immediately explain the blue-shift with increasig temperature, since the bending modes have frequencies starting at 800 cm −1 , which correspond only to a 5% thermal population at the highest temperature. We could not find any clear direct coupling between cage modes and the CH stretch, so it is less clear why and how the expansion of the cage modifies the CH-stretch potential. Nevertheless, since the blue shift with increasing d (NN) is very tenuous as shown in Fig. 6(b), for classical nuclei, increasing the thermal population of this mode results in a very pronounced net red shift. For quantum nuclei, again, there is no competition and we only observe a very small blue shift, which is ∼20 times smaller than in the NH case.  In order to provide a point of reference for the temperature dependence of vibrational quantities that can be easily accessed in experiment, we report the temperature dependent IR spectra of Pc-d0 and Pc-d2 above 1600 cm −1 in Fig. 8, obtained from TRPMD simulations. The full frequency range and the corresponding results obtained from classical nuclei simulations are reported in the ( Fig.  S7 and S8). One point to notice is that compared to full DFT calculations (see SI and Ref. 2 ), the IR spectra we obtain with the HDNNP and KRR combination seem to lose some of the fine features present at the combination-band region just below 2000 cm −1 for Pc-d0, and the fine structure of the NH stretch peak is not so pronounced both in the classical nuclei and the TRPMD simulations. Since both of these features stem from presumably weak mode coupling, we find it plausible that small remaining errors in the models for the forces and in the dipole surface wash out these weak correlations.

Infrared Spectra
As mentioned above, in contrast to the VDOS, the IR spectra contain cross terms involving different nuclei and dipole selection rules. Therefore, lineshapes between VDOS and IR spectra are different and it is not obligatory that trends observed in the VDOS are exactly reflected in the IR spectra. Indeed, the lineshapes in Fig. 8 are different to the VDOS, but the observed temperature dependence of the vibrational peaks in this region is very similar to what was discussed for the VDOS. Specifically, Pc-d0 presents a blue shift of 100 cm −1 and 20 cm −1 for NH and CH stretch peak maxima respectively. For Pc-d2 the CH blue shift remains comparable (15 cm −1 ) and the NH one is only 40 cm −1 . Both the peak positions and the blue-shifts are larger than what would be predicted by the trivial harmonic isotope √ 2 scaling factor, highlighting once more the anharmonic nature of this problem. Finally, we comment on the region of the IR spectrum situated between 1750 and 2000 cm −1 of Pc-d0. The modes in this region are absent in the harmonic spectrum and they are thermally activated in a similar manner (red-shift with increasing temperature) for quantum and classical nuclei. One can thus infer that this band is a combination or overtone of lower-frequency vibrations, as was previously discussed by Gawinkowski and coworkers in Ref. 3 . The red-shift of 100 cm −1 we observe in the TRPMD spectrum is the largest in the whole spectrum (including lower frequencies) and we find the red-shifts of the bands situated between 800 and 1000 cm −1 to be more consistent with the combined shift, thus pointing to overtones. We could not further analyze our results at this point in order confirm this interpretation.

Concluding remarks
In this work we have studied the temperature dependence of structural and vibrational properties of the gas-phase porphycene molecule and its isotopologues, namely Pc-d0, Pc-d2 and Pc-d12, in temperatures ranging from 50 to 400 K. We employed high-dimensional neural-network potentials trained on B3LYP+vdW data to calculate energy and forces, augmented by a kernel ridge regression model for the dipole surface. We took nuclear quantum effects into account through imaginary-time path integral and thermostated ring polymer molecular dynamics.
We could explain the primary and secondary geometric isotope effects related to the N-N and N-H/D distances, and found good agreement with experiment. The strong hydrogen bonds within the cage are strengthened by nuclear quantum effects, such that the N-N distances are larger for Pc-d2 than for Pc-d0, while the N-H/D distances are smaller for Pc-d2 than for Pc-d0. In all cases the cage of porphycene expands with increasing temperature.
Regarding vibrational properties of Pc-d0, we find that anharmonic vibrational spectra derived from classical-nuclei molecular dynamics show a red-shift of both NH and CH stretch regions with increasing temperature. Approximate quantum dynamics (TRPMD) instead predicts a pronounced blue-shift (150 cm −1 ) in the NH stretch region with increasing temperature and a small blue-shift in the CH stretch region. For Pc-d2, the ND stretch shows a less pronounced blue-shift and for Pc-d12 the CD stretch region remains unchanged throughout this temperature range.
The temperature behavior of the NH(ND) stretch regions is a consequence of vibrational coupling between quasi-classical thermally activated cage expansion/contraction modes and high-frequency quantized modes. We could understand the underlying physics of the problem by analyzing a 1D double-well model where the distance (and the barrier height) between the two wells increases with temperature. In classical mechanics there is a competition between the increasing thermal population of the well and the increasing curvature of the potential as the temperature increases. The first causes a red-shift due to the anharmonicity of the potential, while the second causes a blue-shift due to the effective decrease of anharmonicity at the bottom of the well. In quanutm mechanics, at all temperatures studied here, the thermal population of excited vibrational states is completely negligible, such that only the modulation of the potential with temperature plays a role and the result is a pronounced blue shift. The approximate quantum dynamics of TRPMD are able to grasp this effect, because ZPE is correctly described at the different temperatures and the linear-response transition probabilities are only sensitive to the stiffening in the effective curvature of the potential with increasing temperature. Even though the 1D model in this case is useful to understand the underlying physics, no quantitative predictions could be achieved with it, highlighting the inherent high-dimensionality of the problem.
For the CH/D stretch regions the direct coupling between low and high frequency modes is less clear, but we observe the that the cage expansion is correlated to a blue shift with increasing temperature. We note that the (quantum) blue-shift with increasing temperature observed for this isolated gas-phase molecule seems not to have the same origin as the ones observed in condensed phase systems 11,13,73 , where it is understood that the shifts are due to the weakening of intermolecular H-bonds with increasing temperature.
In summary, we have shown that the porphycene molecule is a clear example where considering nuclei as classical particles makes the theory inaccurate and completely removes its predictive power for a range of geometrical and vibrational properties. On the other hand, path-integral based approximations to quantum dynamics, which conserve the quantum Boltzmann distribution and therefore are free from ZPE leakage, are able to grasp the correct underlying physics in this system. In fact, even though we treated an isolated molecule here, the large amount of (anharmonic) degrees of freedom and the temperature range considered washes out most dynamical quantum coherence effects, which presents an ideal ground for such methods. We have also demonstrated that when a large amount of data from ab initio simulations is available, the effort to train high dimensional neural network potentials and kernel ridge regression models that accurately reproduce energy, forces and dipoles can be minimal. This makes it possible to address new physical problems that would be too cost-intensive with accurate DFT potentials -in particular, problems that take the quantum nature of nuclei into account. Open access data repositories that can handle, specifically, data related to ab initio molecular dynamics trajectories will be paramount to make this approach more common in the near future.