Mindaugas Macernis*
Institute of Chemical Physics, Vilnius University, Saulėtekio av. 3, Vilnius, LT-10257, Lithuania. E-mail: mindaugas.macernis@ff.vu.lt
First published on 17th June 2025
Atomic orbital (AO) normalization is a foundational assumption in electronic structure theory, yet in practice, the norm of contracted basis functions can deviate from unity due to internal reduction and transformation mechanisms applied by quantum chemistry packages. This work presents a systematic framework for analyzing the physical and numerical consequences of primitive basis function elimination and AO-level norm inconsistency. The implemented methodology quantifies norm loss, separates constructive and destructive contributions, and enables precise renormalization by retaining both positive and negative terms within AO representations. Using two representative systems—a Raman-active carotenoid (lycopene) and a phosphorus dimer with through-space J(P–P) coupling—sensitivity to AO normalization was evaluated. While vibrational frequencies remained stable across normalization schemes, Raman intensities and J-coupling constants showed non-negligible shifts: up to 6 Hz for phosphorus and over 50 units in Raman activity. The results demonstrate that AO normalization is not merely a numerical refinement, but a physically impactful step with implications for precision spectroscopy and quantum computing applications.
Gaussian-type orbitals (GTOs), centered on atomic nuclei, are commonly used as basis functions in the linear combination of atomic orbitals (LCAO) method.1,3,4 Generally contracted basis sets include the correlation-consistent (cc) families proposed by Dunning and coworkers,5–12 widely used in the form of cc-pVXZ (X = D, T, Q, 5, 6), where X denotes the number of contracted Gaussian-type orbitals (CGTOs) used to represent each occupied atomic orbital. These sets can be systematically extended with diffuse functions, denoted by the aug-, d-aug-, and t-aug- prefixes for singly, doubly, and triply augmented variants.
Another general contraction approach is the atomic natural orbital (ANO) basis set, in which the contraction coefficients are obtained by optimizing atomic energies.13 Optimized atomic Slater-type orbitals (STOs) were often approximated by fixed linear combinations of Gaussian functions,14 forming the basis of widely used Pople-type basis sets such as 3-21G and 6-311G.15–17 Molecular orbitals have a functional form2,18 given by the equation:
![]() | (1) |
The effects of basis set reduction have been extensively studied across multiple quantum chemical methods, including density functional theory (DFT), Hartree–Fock (HF), and coupled-cluster (CC), to improve accuracy or reduce computational cost.18–34 However, studies targeting high-precision results26,30,31,35–46 or analysing basis set superposition errors (BSSE)35–37,40,43,46 often face reproducibility challenges due to undocumented and automatically applied normalization procedures, as demonstrated in this work. Notably, the widely used cc-pVDZ basis set provides a clear example of predefined reductions embedded in default system libraries, despite its frequent use in high-accuracy applications.30,39–42,44–46 For this reason, cc-pVDZ was selected as a test case to evaluate how AO reduction and normalization affect computed results.
This paper introduces a framework for evaluating the physical and numerical effects of primitive Gaussian basis function elimination. Norm loss and component-level analysis reveal that atomic orbital (AO) pruning—often applied automatically and without user control—can impact computed molecular properties. AO normalization is revisited through the separation of positive and negative contraction contributions, representing constructive and destructive superposition components.
A case study using the cc-pVDZ5,47 basis set is presented for hydrogen, carbon, and phosphorus, elements selected for their typical angular momentum behaviour and susceptibility to internal reduction in system libraries. Calculations were performed using Gaussian,48 which enables explicit basis input and reflects normalization behaviour during SCF processing. The study assesses the impact of normalization strategies on total energies, dipole moments, Raman intensities, and J(P–P) couplings.
Two representative systems were selected to assess the physical implications of AO normalization: lycopene (LYC), a conjugated carotenoid (Car) with well-defined Raman-active modes,49–53 and bis(diphenylphosphino)methane (dppm), which enables through-space P–P spin–spin coupling relevant to quantum computing54–58 and other studies.59 These molecules provide sensitivity benchmarks for vibrational intensities and sub-Hz nuclear coupling constants.
All results are based on a reproducible methodology implemented in the open-source tool BasisSculpt, which supports precise and controlled AO normalization.
![]() | (2) |
The reduction procedure involves eliminating a single atomic orbital component, which ideally should only affect the total norm at the level of numerical noise or computational precision. For illustration, consider reducing a three-term expansion to two terms:
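$\phi_{\mathrm{full}} = c_1 N_1 e^{-\alpha_1 r^2} + c_2 N_2 e^{-\alpha_2 r^2} + c_3 N_3 e^{-\alpha_3 r^2}$ | (3) |
$\phi_{\mathrm{reduced}} = c_1 N_1 e^{-\alpha_1 r^2} + c_2 N_2 e^{-\alpha_2 r^2}$ | (4) |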
Renormalization should result in the integrals being almost equal:
![]() | (5) |
However, the effect of the reduction is lost in this case, as both integrals are normalized to one. Therefore, the most informative approach is to compare the original and reduced integrals before renormalization. This allows for quantifying the relative contribution of the removed term as a percentage.
![]() | (6) |
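The comparison of the original and reduced integrals before renormalization can be sketched numerically. The snippet below is a minimal illustration, assuming normalized s-type primitives on one centre and taking the loss as the difference of the squared norms divided by the full squared norm; the function names and the hydrogen cc-pVDZ exponents and coefficients are assumptions introduced for the example, not part of the original text.

```python
import math

def s_overlap(a, b):
    """Overlap of two normalized s-type Gaussian primitives on one centre."""
    return (2.0 * math.sqrt(a * b) / (a + b)) ** 1.5

def norm_sq(alphas, coeffs):
    """Squared norm of a contraction of normalized s-type primitives."""
    return sum(ci * cj * s_overlap(ai, aj)
               for ai, ci in zip(alphas, coeffs)
               for aj, cj in zip(alphas, coeffs))

def loss_percent(alphas, coeffs, k):
    """Norm lost (in % of the full squared norm) when primitive k is dropped,
    evaluated before any renormalization."""
    full = norm_sq(alphas, coeffs)
    reduced = norm_sq(alphas[:k] + alphas[k + 1:], coeffs[:k] + coeffs[k + 1:])
    return 100.0 * (full - reduced) / full

# Assumed input: first S contraction of hydrogen cc-pVDZ.
H_ALPHAS = [13.01, 1.962, 0.4446, 0.122]
H_COEFFS = [0.019685, 0.137977, 0.478148, 0.501240]

for k, alpha in enumerate(H_ALPHAS):
    print(f"alpha = {alpha:>7}: loss = {loss_percent(H_ALPHAS, H_COEFFS, k):.4f} %")
```

Because each removed primitive also takes its cross terms with the remaining primitives, the losses of the individual primitives do not add up to 100%. Under the stated assumptions, the values should be close to the hydrogen "Block" loss column of Table 1.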
This raises the question of whether eliminating negative contraction coefficients is justified, given their suppressive contributions. In the normalization procedure, their impact is expected to be negligible if the coefficients are very small, and the total energy should remain unaffected. In such cases, renormalization may not be physically meaningful. However, when multiple negative contraction coefficients are involved, the function can be separated into constructive and destructive components.
$\phi(r) = \phi_+(r) + \phi_-(r)$ | (7) |
If renormalization is applied to only one component, the imbalance causes the remaining component to dominate, degrading the function's physical accuracy. Thus, normalization must be applied proportionally to all components to preserve the functional balance.
$s_+^2\,\|\phi_+\|^2 + s_-^2\,\|\phi_-\|^2 + 2 s_+ s_-\,\langle\phi_+,\phi_-\rangle = 1$ | (8) |
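As a short worked step, reading "proportionally" as a single common scale factor for both components (an assumption; the text does not fix the choice of the two scale factors explicitly), eq. (8) reduces to the ordinary normalization of the whole contraction:

```latex
\begin{aligned}
s_+ = s_- = s \;\Rightarrow\;
s^2\left(\|\phi_+\|^2 + \|\phi_-\|^2 + 2\langle\phi_+,\phi_-\rangle\right) &= 1,\\
s^2\,\|\phi_+ + \phi_-\|^2 = s^2\,\|\phi\|^2 &= 1,\\
s &= \frac{1}{\|\phi\|}.
\end{aligned}
```

With this choice the ratio between the constructive and destructive parts is unchanged, which is the balance the preceding paragraph asks for.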
A computational simplification involves splitting the normalization into two steps and verifying consistency between pre- and post-normalization values.
![]() | (9) |
![]() | (10) |
However, it must be noted that
$\|\phi\|^2 \neq \|\phi_+\|^2 + \|\phi_-\|^2$ | (11) |
$\langle\phi_+,\phi_-\rangle$ | (12) |
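The role of the cross term can be checked numerically. The sketch below splits a contraction into its positive-coefficient and negative-coefficient parts, evaluates the three quantities entering eq. (8), and applies one common scale factor. The four-term contraction is hypothetical, s-type primitives are assumed, and all names are illustrative.

```python
import math

def s_overlap(a, b):
    """Overlap of two normalized s-type Gaussian primitives on one centre."""
    return (2.0 * math.sqrt(a * b) / (a + b)) ** 1.5

def inner(p, q):
    """<p, q> for two contractions given as lists of (alpha, coeff)."""
    return sum(cp * cq * s_overlap(ap, aq) for ap, cp in p for aq, cq in q)

def split(prims):
    """Separate constructive (c > 0) and destructive (c < 0) primitives."""
    plus = [(a, c) for a, c in prims if c > 0]
    minus = [(a, c) for a, c in prims if c < 0]
    return plus, minus

# Hypothetical contraction with two negative coefficients (not from any basis set).
PRIMS = [(50.0, -0.05), (5.0, -0.20), (0.5, 0.70), (0.1, 0.50)]

plus, minus = split(PRIMS)
n_plus = inner(plus, plus)      # ||phi_+||^2
n_minus = inner(minus, minus)   # ||phi_-||^2
cross = inner(plus, minus)      # <phi_+, phi_->, negative here
n_total = inner(PRIMS, PRIMS)   # ||phi||^2

# One common scale factor for both parts (proportional renormalization).
s = 1.0 / math.sqrt(n_total)
check = s * s * n_plus + s * s * n_minus + 2.0 * s * s * cross

print(f"||phi||^2 = {n_total:.6f}, sum of parts = {n_plus + n_minus:.6f}, cross = {cross:.6f}")
print(f"left-hand side of eq. (8) after scaling = {check:.12f}")
```

The difference between the total squared norm and the sum of the two partial norms is exactly twice the cross term, so scaling only one part would change that balance.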
From this point, a proportional analysis was performed to assess whether the atomic orbitals (AOs) in the basis set are appropriately normalized. This involves evaluating the total contribution of all Gaussian primitives constructing a given AO, regardless of whether the AO is defined across multiple blocks (e.g., multiple entries for S orbitals).
However, when α values are repeated across blocks, the entire block can be reduced. To take this into account, the intra-block reduction effects on all selected AO blocks were treated as one joined block (Join-block). This technique allows the contribution of each primitive Gaussian function to the AO to be calculated on the basis of its grouping into a Join-block.
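As a rough sketch of the Join-block idea, the snippet below computes each primitive's percentage share within its own block and within the joined set of all blocks of one atom. The weight |c_k N_k| with N_k = (2α_k/π)^(3/4) applied to every block regardless of angular momentum is an assumption rather than something stated in the text, as are the function names and the hydrogen cc-pVDZ input values.

```python
import math

def weight(alpha, coeff):
    """|c_k N_k| with N_k = (2*alpha/pi)**0.75 (assumed for every block)."""
    return abs(coeff) * (2.0 * alpha / math.pi) ** 0.75

def contributions(blocks):
    """Percentage share of each primitive in its block and in the Join-block.

    blocks: list of (label, [(alpha, coeff), ...]) for one atom.
    Returns rows of (label, alpha, block_share, join_share).
    """
    join_total = sum(weight(a, c) for _, prims in blocks for a, c in prims)
    rows = []
    for label, prims in blocks:
        block_total = sum(weight(a, c) for a, c in prims)
        for a, c in prims:
            w = weight(a, c)
            rows.append((label, a, 100.0 * w / block_total, 100.0 * w / join_total))
    return rows

# Assumed input: hydrogen cc-pVDZ blocks (S contraction, uncontracted S, P).
H_BLOCKS = [
    ("S", [(13.01, 0.019685), (1.962, 0.137977), (0.4446, 0.478148), (0.122, 0.501240)]),
    ("S", [(0.122, 1.0)]),
    ("P", [(0.727, 1.0)]),
]

rows = contributions(H_BLOCKS)
for label, alpha, block_share, join_share in rows:
    print(f"{label} alpha = {alpha:>7}: block {block_share:8.4f} %, join {join_share:8.4f} %")
```

Under these assumptions the shares should come out close to the hydrogen contribution columns of Table 1; a single-primitive block trivially has a 100% block share, which the table marks as not applicable.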
Norm reduction was expressed as a percentage relative to the originally selected full block. The current analysis focuses on hydrogen (H), carbon (C), and phosphorus (P) atoms, which were chosen to investigate the effects of basis set normalization on molecular properties in the selected systems. The analysis was performed with cc-pVDZ taken from the basis set exchange.60
We used different cc-pVDZ basis set normalization approaches A1, A2, A3Exc and A4BS:
A1 – the typical basis set normalization as implemented in Gaussian software;48
A2 – the basis set normalization as implemented in Gaussian software by using the keyword which prevents basis set reduction;48
A3Exc – the typical basis set normalization as implemented in Gaussian software48 but with the basis set provided from the basis set exchange (BSE);60
A4BS – the basis set provided from the basis set exchange,60 renormalized with the BasisSculpt implementation while preserving the positive and negative AO contributions.
The analysed normalization approaches – A1, A2, A3Exc, and A4BS – apply different reduction strategies. A3Exc uses the original cc-pVDZ basis set without any reduction. A4BS retains both constructive and destructive components after explicit normalization. In contrast, A2, provided by default within the internal system library, includes only initial reductions without renormalization. A1 applies48 a more aggressive reduction strategy, starting from an A2-type basis set.
It should be noted that A1 and A2 start from the basis set in the internal Gaussian software database, which already has reductions applied. For example, hydrogen has 3 α values in the first block (AO labelled as S), whereas A3Exc from the BSE has 4; carbon has 8 α values in the first two blocks (AO labelled as S), whereas the BSE reports 9, and its fourth block (AO labelled as P) has 3 α values, whereas the BSE has 4; phosphorus has 11 α values in the first three blocks (AO labelled as S), whereas the BSE reports 12, and its fifth and sixth blocks (AO labelled as P) have 7 α values, whereas the BSE has 8.
The next AO-labelled S block for carbon has the same α values but exhibits different reductions in the A1-type basis set normalization because its contraction coefficients differ. Reductions occur at α = 1000, α = 228 and α = 0.1596, whose contributions to the block are 6.324%, 10.352% and 4.517%, respectively, with normalization losses of 0.007%, 0.079% and 75.216%. When considering the Join-block, the contributions change to 1.274%, 2.086% and 0.91%, and the corresponding normalization losses change to −0.001%, −0.016% and 19.677%. Initially, the block consists of 2 primitives forming the constructive part ϕ+(r), and 7 forming the destructive part ϕ−(r). After reduction, the block consists of 3 constructive components and 4 destructive components. In the internal system library (A2), all 8 components are present, with 1 constructive component reduced (Table 1).
| Atom | ϕ (l) | α | δ loss, % (Block) | δ loss, % (Join-block) | δ contribution, % (Block) | δ contribution, % (Join-block) |
|---|---|---|---|---|---|---|
| H | S | 13.01 | 0.9082 | 0.3129 | 18.5385 | 7.8348 |
|  |  | 1.962 | 15.6778 | 6.1664 | 31.4458 | 13.2897 |
|  |  | 0.4446 | 68.0162 | 29.8528 | 35.7908 | 15.1260 |
|  |  | 0.122 | 65.4686 | 29.0632 | 14.2248 | 6.0117 |
|  | S | 0.122 | – | 51.5294 | – | 11.9937 |
|  | P | 0.727 | – | 50.7382 | – | 45.7441 |
| C | S | 6665 | 0.0053 | 0.0002 | 4.7585 | 3.1704 |
|  |  | 1000 | 0.1478 | 0.0057 | 8.8341 | 5.8857 |
|  |  | 228 | 1.7935 | 0.0730 | 14.8105 | 9.8675 |
|  |  | 64.71 | 11.7803 | 0.5400 | 21.6343 | 14.4138 |
|  |  | 21.06 | 41.2350 | 2.4047 | 25.1787 | 16.7753 |
|  |  | 7.495 | 67.2472 | 6.0494 | 18.9417 | 12.6199 |
|  |  | 2.797 | 39.7860 | 6.0679 | 5.7477 | 3.8294 |
|  |  | 0.5215 | 1.1944 | 0.5472 | 0.0870 | 0.0579 |
|  |  | 0.1596 | −0.1204 | −0.1151 | 0.0075 | 0.0050 |
|  | S | 6665 | 0.0002 | −0.0000 | 3.3191 | 0.6689 |
|  |  | 1000 | 0.0068 | −0.0013 | 6.3244 | 1.2746 |
|  |  | 228 | 0.0790 | −0.0161 | 10.3523 | 2.0863 |
|  |  | 64.71 | 0.5217 | −0.1346 | 16.3915 | 3.3034 |
|  |  | 21.06 | 1.3848 | −0.6407 | 19.3767 | 3.9050 |
|  |  | 7.495 | −0.9100 | −2.3578 | 20.9376 | 4.2196 |
|  |  | 2.797 | −7.2674 | −2.9047 | 8.4826 | 1.7095 |
|  |  | 0.5215 | 63.2844 | 18.5204 | 10.2985 | 2.0755 |
|  |  | 0.1596 | 75.2160 | 19.6771 | 4.5174 | 0.9104 |
|  | S | 0.1596 | – | 32.3308 | – | 1.5683 |
|  | P | 9.439 | 2.4915 | 0.5202 | 20.6336 | 1.2746 |
|  |  | 2.002 | 25.0107 | 5.1258 | 35.4482 | 2.1897 |
|  |  | 0.5456 | 69.0712 | 17.2413 | 32.4600 | 2.0052 |
|  |  | 0.1517 | 57.4163 | 15.9485 | 11.4583 | 0.7078 |
|  | P | 0.1517 | – | 32.0338 | – | 1.5097 |
|  | D | 0.55 | – | 32.0229 | – | 3.9667 |
| P | S | 94840 | 0.0011 | 0.0000 | 3.1353 | 1.8169 |
|  |  | 14220 | 0.0330 | 0.0009 | 5.8598 | 3.3958 |
|  |  | 3236 | 0.4426 | 0.0127 | 10.0105 | 5.8011 |
|  |  | 917.1 | 3.5326 | 0.1063 | 15.6963 | 9.0962 |
|  |  | 299.5 | 17.2745 | 0.5714 | 21.5746 | 12.5026 |
|  |  | 108.1 | 47.9026 | 1.9171 | 23.4950 | 13.6156 |
|  |  | 42.18 | 63.5880 | 3.5418 | 15.8078 | 9.1607 |
|  |  | 17.28 | 31.6595 | 2.6245 | 4.2889 | 2.4855 |
|  |  | 4.858 | 1.5508 | 0.2862 | 0.1219 | 0.0706 |
|  |  | 1.818 | −0.1362 | −0.0578 | 0.0090 | 0.0052 |
|  |  | 0.3372 | 0.0122 | 0.0239 | 0.0008 | 0.0004 |
|  |  | 0.1232 | −0.0026 | −0.0102 | 0.0002 | 0.0001 |
|  | S | 94840 | 3.1353 | −0.0000 | 2.1813 | 0.4956 |
|  |  | 14220 | 5.8598 | −0.0003 | 4.0366 | 0.9171 |
|  |  | 3236 | 10.0105 | −0.0036 | 7.0493 | 1.6016 |
|  |  | 917.1 | 15.6963 | −0.0310 | 11.0007 | 2.4994 |
|  |  | 299.5 | 21.5746 | −0.1898 | 16.3825 | 3.7222 |
|  |  | 108.1 | 23.4950 | −0.7379 | 19.3444 | 4.3951 |
|  |  | 42.18 | 15.8078 | −2.0254 | 18.9770 | 4.3116 |
|  |  | 17.28 | 4.2889 | −1.4660 | 5.6376 | 1.2809 |
|  |  | 4.858 | 0.1219 | 8.2875 | 9.8274 | 2.2328 |
|  |  | 1.818 | 76.9352 | 12.6235 | 5.4570 | 1.2398 |
|  |  | 0.3372 | 3.8829 | 1.1728 | 0.0945 | 0.0215 |
|  |  | 0.1232 | −0.5893 | −0.2986 | 0.0117 | 0.0027 |
|  | S | 94840 | 0.0000 | 0.0000 | 1.8084 | 0.1360 |
|  |  | 14220 | 0.0002 | 0.0001 | 3.3552 | 0.2523 |
|  |  | 3236 | 0.0025 | 0.0010 | 5.8412 | 0.4392 |
|  |  | 917.1 | 0.0190 | 0.0084 | 9.1741 | 0.6897 |
|  |  | 299.5 | 0.0886 | 0.0507 | 13.6330 | 1.0250 |
|  |  | 108.1 | 0.1576 | 0.1969 | 16.4258 | 1.2349 |
|  |  | 42.18 | −0.3314 | 0.5350 | 16.3365 | 1.2282 |
|  |  | 17.28 | −0.7092 | 0.4414 | 5.3130 | 0.3994 |
|  |  | 4.858 | 1.3662 | −3.4933 | 11.0748 | 0.8326 |
|  |  | 1.818 | −19.393 | −8.8649 | 10.1957 | 0.7665 |
|  |  | 0.3372 | 60.5582 | 18.7893 | 4.8343 | 0.3635 |
|  |  | 0.1232 | 70.7104 | 16.0662 | 2.0079 | 0.1510 |
|  | S | 0.1232 | – | 27.8086 | – | 0.2736 |
|  | P | 370.5 | 0.1269 | 0.0168 | 5.3800 | 0.4389 |
|  |  | 87.33 | 2.2678 | 0.2291 | 13.9373 | 1.1370 |
|  |  | 27.59 | 15.5405 | 1.3645 | 25.1540 | 2.0521 |
|  |  | 10 | 47.1860 | 4.4096 | 29.7118 | 2.4239 |
|  |  | 3.825 | 64.8942 | 7.9124 | 20.1593 | 1.6446 |
|  |  | 1.494 | 33.4986 | 5.8694 | 5.5160 | 0.4500 |
|  |  | 0.3921 | 1.4516 | 0.5302 | 0.1349 | 0.0110 |
|  |  | 0.1186 | −0.0870 | −0.0632 | 0.0068 | 0.0006 |
|  | P | 370.5 | 0.0060 | −0.0041 | 4.2919 | 0.1067 |
|  |  | 87.33 | 0.0929 | −0.0547 | 10.7575 | 0.2673 |
|  |  | 27.59 | 0.4846 | −0.3601 | 20.8513 | 0.5182 |
|  |  | 10 | 0.1021 | −1.1660 | 23.6949 | 0.5888 |
|  |  | 3.825 | −5.1057 | −2.5715 | 19.5532 | 0.4859 |
|  |  | 1.494 | −0.7868 | −0.2182 | 0.6515 | 0.0162 |
|  |  | 0.3921 | 66.5573 | 16.0722 | 14.1101 | 0.3506 |
|  |  | 0.1186 | 75.0254 | 16.4273 | 6.0896 | 0.1513 |
|  | P | 0.1186 | – | 27.6062 | – | 0.2659 |
|  | D | 0.3730 | – | 28.7026 | – | 0.6280 |
In the case of larger basis sets, greater and less intuitive reductions can be expected. For example, the carbon cc-pVTZ5,6 case also shows a significant imbalance between constructive and destructive components, as presented in the ESI.†
The second AO-labelled S block for phosphorus should have the same α values as the first S block, but exhibits different reductions in the A1-type basis set normalization due to differing contraction coefficients. Reductions occur at α = 94840, α = 14220, α = 0.3372 and α = 0.1232, with contributions to the block of 2.181%, 4.037%, 0.095% and 0.012%, respectively, and normalization losses of 3.135%, 5.86%, 3.883% and −0.589%. When considering the Join-block, the contributions change to 0.496%, 0.917%, 0.022% and 0.003%, and the corresponding normalization losses change to >−0.001%, >−0.001%, 1.173% and −0.299%. Initially, the block consists of 4 primitives forming the constructive part ϕ+(r), and 8 forming the destructive part ϕ−(r). After reduction, the block consists of 2 constructive components and 6 destructive components, which is one α value fewer than in the first AO-labelled S block. In the internal system library (A2), all 11 components are present, with 1 destructive component reduced.
The third AO-labelled S block for phosphorus should also have the same α values as the first S block, but again exhibits different reductions in the A1-type basis set normalization due to differing contraction coefficients. Reductions occur at α = 94840, α = 14220, α = 299.5 and α = 0.1232, with contributions to the block of 1.808%, 3.355%, 13.633% and 2.008%, respectively, and normalization losses of <0.001%, <0.001%, 0.089% and 70.71%. When considering the Join-block, the contributions change to 0.136%, 0.252%, 1.025% and 0.151%, and the corresponding normalization losses change to <0.001%, <0.001%, 0.051% and 16.067%. Initially, the block consists of 10 primitives forming the constructive part ϕ+(r), and 2 forming the destructive part ϕ−(r). After reduction, the block consists of 5 constructive components and 3 destructive components. In the internal system library (A2), all 11 components are present, with 1 constructive component reduced.
The first AO-labelled P block for phosphorus exhibits reductions at α = 0.3921 and α = 0.1186, with contributions to the block of 0.135% and 0.007%, respectively, and normalization losses of 1.452% and −0.087%. When considering the Join-block, the contributions change to 0.011% and 0.0006%, and the corresponding normalization losses change to 0.53% and −0.063%. Initially, the block consists of 7 primitives forming the constructive part ϕ+(r), and 1 forming the destructive part ϕ−(r). After reduction, the block consists of 6 constructive components without destructive components. In the internal system library (A2), all 7 components are present, with 1 destructive component reduced.
The second AO-labelled P block for phosphorus should also have the same α values as the first P block, but exhibits different reductions in the A1-type basis set normalization due to differing contraction coefficients. Reductions occur at α = 370.5 and α = 0.1186, with contributions to the block of 4.292% and 6.09%, respectively, and normalization losses of 0.006% and 75.025%. When considering the Join-block, the contributions change to 0.107% and 0.151%, and the corresponding normalization losses change to −0.004% and 16.427%. Initially, the block consists of 2 primitives forming the constructive part ϕ+(r), and 6 forming the destructive part ϕ−(r). After reduction, the block consists of 4 constructive components and 2 destructive components. In the internal system library (A2), all 7 components are present, with 1 constructive component reduced. This component results in a 75.025% loss in normalization, despite contributing only 6.09%. In contrast, destructive components with lower percentage contributions are not eliminated.
Using BasisSculpt, the actual norms of AO blocks used in the A1-type calculations were evaluated. Despite Gaussian applying internal normalization (A1), the resulting norms were not consistently equal to one. For example, the carbon atom P block reached 1.08, while phosphorus atom P blocks ranged from 1.01 to 1.13. These findings confirm that internal normalization in Gaussian software includes amplitude scaling, particularly for angular momentum components, which may affect sensitive properties.
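A norm check of this kind can be sketched for blocks of any angular momentum. The snippet below is not the BasisSculpt implementation; it is a minimal illustration that assumes normalized primitives sharing one centre and one angular part, for which the overlap of two primitives is (2√(αᵢαⱼ)/(αᵢ+αⱼ))^(l+3/2). The P-type block and all names are hypothetical.

```python
import math

def overlap(a, b, l):
    """Overlap of two normalized primitives r^l * exp(-alpha r^2)
    that share the same centre and angular part."""
    return (2.0 * math.sqrt(a * b) / (a + b)) ** (l + 1.5)

def block_norm_sq(prims, l):
    """Squared norm of a contracted block given as [(alpha, coeff), ...]."""
    return sum(ci * cj * overlap(ai, aj, l) for ai, ci in prims for aj, cj in prims)

def renormalize(prims, l):
    """Scale all coefficients by one common factor so the block norm is 1."""
    scale = 1.0 / math.sqrt(block_norm_sq(prims, l))
    return [(a, c * scale) for a, c in prims]

# Hypothetical P-type block (l = 1); the numbers are illustrative only.
P_BLOCK = [(9.0, 0.04), (2.0, 0.22), (0.5, 0.52), (0.15, 0.45)]

before = block_norm_sq(P_BLOCK, l=1)
after = block_norm_sq(renormalize(P_BLOCK, l=1), l=1)
print(f"squared norm before: {before:.6f}, after renormalization: {after:.6f}")
```

A block whose squared norm differs from one after the package has processed it indicates amplitude scaling of the kind described above; the common-factor rescaling keeps the relative weights of all primitives unchanged.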
Fig. 1 Raman spectrum of lycopene showing three main peaks, with the chemical structure displayed below.
Ground state, dipole moment, UV spectra, frequency and Raman activity calculations for LYC were performed using DFT (B3LYP/cc-pVDZ) with four different basis set normalization strategies: A1 (default Gaussian), A2 (system library), A3Exc (original cc-pVDZ from basis set exchange), and A4BS (BasisSculpt-renormalized). The LYC structures were optimized with the same DFT methodology as that used for the analysed results.
In all cases, the dipole moment (field-independent) remained constant at 0.5882 Debye. The total energy was identical for A1, A2, and A3Exc (−1557.93462692 Hartree), while for A4BS it differed slightly by −0.0001533472 eV (about −5.6 × 10⁻⁶ Hartree), which is within the numerical noise range (0.0001 Hartree). Zero-point energy corrections were effectively the same, with A4BS differing only by 0.000001 Hartree per particle.
UV spectra were calculated using time-dependent density functional theory (TDDFT). In all cases, the lowest excited states were identical; for example, S1 was calculated at 2.0866 eV, with an oscillator strength of 4.3861 and a major contribution (0.7104) from HOMO to LUMO.
Taking A1 as the reference, the ν1 frequency (1553.6621 cm−1) decreased by 0.0001 cm−1 for A2, increased by 0.0001 cm−1 for A3Exc, and increased by 0.0017 cm−1 for A4BS. The corresponding Raman activity changes were −1.1379, −12.1227, and +5.088, respectively.
For ν2 (1191.1613 cm−1), the frequency decreased by 0.0002 cm−1 for A2, decreased by 0.0001 cm−1 for A3Exc, and increased by 0.0008 cm−1 for A4BS. The Raman activity differences were +1.5469, +6.9930, and +1.5918.
For ν3 (1012.7001 cm−1), all basis sets yielded the same frequency, with a minor decrease of 0.0001 cm−1 for A4BS. Raman activity shifts were +3.2633, +35.0093, and +51.2232.
These results show that even minor deviations in AO normalization – particularly when destructive components are either retained or removed – can lead to measurable differences in Raman intensities, despite the fact that all calculations involved additional renormalization and yielded virtually identical total energies, UV spectra and vibrational frequencies.
Ground-state properties, dipole moments, and J-coupling values for dppm were calculated using DFT (B3LYP/cc-pVDZ) under four different basis set normalization strategies: A1 (default Gaussian), A2 (system library), A3Exc (original cc-pVDZ from basis set exchange), and A4BS (BasisSculpt-renormalized). Geometry optimization was performed with A1 to ensure consistent phosphorus–phosphorus distances across the first comparison group. To fully evaluate the effect of AO normalization on structure itself, an additional geometry optimization was performed using the A4BS basis set (labelled A4BSopt).
As expected, the field-independent dipole moment remained constant (1.6010 Debye) for A1, A2, A3Exc, and A4BS. However, in the A4BSopt case, the dipole moment increased to 1.6153 Debye. This change corresponds to a slight increase in the P–P distance by ∼0.01 Å (from 3.13586 Å to 3.14435 Å) indicating subtle structural rearrangements in the dppm geometry associated with renormalized AO amplitudes.
The total energies were identical (−1648.69310726 Hartree) for A2 and A3Exc; A1 differed marginally (−1648.69310645 Hartree), remaining within numerical noise. In contrast, A4BS and A4BSopt exhibited a decrease in total energy by 0.031 eV and 0.095 eV, respectively.
Total nuclear spin–spin coupling constants J(P–P) for A1, A2, A3Exc, and A4BS were found to be 100.89 Hz, 101.033 Hz, 101.034 Hz, and 101.247 Hz, respectively. This means that even at a fixed geometry (3.13586 Å), the J value varied by up to 0.4 Hz depending solely on the AO normalization strategy. In the A4BSopt case, where structural relaxation was allowed under renormalized AO functions, the J-coupling decreased more substantially to 95.322 Hz – highlighting a 6 Hz shift solely due to basis set normalization effects.
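The quoted spreads follow directly from the reported coupling constants. The short snippet below only redoes that arithmetic; the dictionary layout and variable names are illustrative, and the numbers are the J(P–P) values given above.

```python
# J(P-P) in Hz for each normalization strategy, as reported in the text.
J_HZ = {
    "A1": 100.89,
    "A2": 101.033,
    "A3Exc": 101.034,
    "A4BS": 101.247,
    "A4BSopt": 95.322,
}

# Strategies evaluated at the same fixed geometry (P-P distance 3.13586 Å).
FIXED_GEOMETRY = ["A1", "A2", "A3Exc", "A4BS"]

fixed_values = [J_HZ[name] for name in FIXED_GEOMETRY]
spread_fixed = max(fixed_values) - min(fixed_values)

# Shift after relaxing the geometry with the renormalized basis set.
shift_vs_a4bs = J_HZ["A4BS"] - J_HZ["A4BSopt"]
shift_vs_a1 = J_HZ["A1"] - J_HZ["A4BSopt"]

print(f"spread at fixed geometry: {spread_fixed:.3f} Hz")
print(f"A4BS -> A4BSopt: {shift_vs_a4bs:.3f} Hz, A1 -> A4BSopt: {shift_vs_a1:.3f} Hz")
```

The fixed-geometry spread rounds to the 0.4 Hz quoted above, and the shift on relaxation is between about 5.6 and 5.9 Hz depending on the reference, consistent with the rounded figure of 6 Hz.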
The results demonstrate that AO normalization is not merely a numerical refinement, but a physically impactful operation that directly influences computed spin–spin couplings. For systems involving heavy atoms such as phosphorus and properties requiring sub-Hz resolution, such as J(P–P), strict control over AO norms becomes essential. The observed variations confirm that even when total energies remain nearly identical, the shape and amplitude of AO functions – especially in normalized or destructively altered blocks – can significantly affect sensitive observables. Future work should explore scenarios in which package-level renormalization is fully disabled, as this may substantially alter the resulting physical predictions and reveal normalization-dependent effects more explicitly.
$\phi(r) = \sum_k c_k N_k e^{-\alpha_k r^2}$ | (13) |
This is a contracted atomic orbital (AO), composed of primitive GTOs with primitive exponents αk, normalization coefficients Nk, and contraction coefficients ck, typically provided in the S and P (SP, D etc.) blocks of the basis set file. Each contracted block is expected to be normalized to unity.2
For the chosen basis set, sensitivity was evaluated through the energies, Raman spectra, and UV spectra of lycopene. The J-coupling was calculated for bis(diphenylphosphino)methane (dppm), as it has P–P interactions. All structures were optimised with the B3LYP61 functional. All computations were performed using Gaussian 16 Rev C.01.48
Projection-based normalization, incorporating both constructive and destructive contributions, was shown to be essential for preserving the physical form of AO blocks. The results reveal that default package-level normalization – such as in Gaussian – can lead to amplitude scaling, with AO norms varying up to ±13% across S and P blocks.
The vibrational frequency shifts remain negligible under different normalization schemes, but Raman activities change by more than 50 units and spin–spin couplings (e.g., J(P–P)) by up to 6 Hz. AO normalization directly affects second- and fourth-order properties, which are critical for spectroscopic interpretation and quantum technology applications.
While basis set reduction does not significantly affect total energies or UV-vis spectra for larger molecules (e.g., >20 atoms), the results of this study suggest otherwise for Raman intensities and spin–spin coupling constants. AO normalization influences the radial amplitude and curvature of contracted atomic orbitals, particularly near the nucleus. When constructive or destructive interference patterns are altered through primitive elimination or renormalization, the local electron density at the nucleus can change – even if the total molecular energy remains stable. These localized changes directly impact properties such as Fermi contact terms in NMR and Raman activity, which depend sensitively on electronic response near nuclei. This suggests that higher-order response properties, such as hyperpolarizabilities, may be even more susceptible to AO norm deviations and should be treated with particular care when using reduced or automatically normalized basis sets.
On the other hand, the use of larger basis sets – especially in contexts requiring high precision – is expected to increase. However, predefined basis set reduction may unintentionally degrade accuracy in such cases. For example, hydrogen bond interaction analysis presented in ref. 62 demonstrated accuracy limitations when reduction affects delicate intermolecular features.
Future work should focus on enabling control over, or fully disabling, internal normalization mechanisms in electronic structure packages to ensure full transparency and reproducibility in basis set behaviour. As the results indicate, it is sometimes insufficient to report only the DFT method and package version, as even different deployments of the same software version may apply distinct internal basis set reduction procedures. This highlights the need to explicitly report the applied basis set normalization strategy alongside computational results.
While the proposed framework in this study is based on contracted Gaussian primitives and is therefore formally compatible with ANO-type basis sets, its practical application requires further adjustments. ANO constructions often involve irregular contraction patterns, unbalanced block structures, and uncontracted diffuse functions optimized for correlated methods. Accordingly, block-wise normalization, primitive elimination, and contribution analysis must be adapted to handle flexible contraction schemes and extended numerical integration domains. These adjustments are technical in nature and do not limit the conceptual generality of the approach.
The approach proposed in this study is not intended as an alternative to normalization, but rather a method to preserve the constructive and destructive character of the AO block components during renormalization. Whether these contributions are critical depends not only on atomic type (e.g., heavy elements) or molecular polarity, but also on the specific structure of the basis set – particularly in cases where destructive components compensate for large positive amplitudes. In such cases, improper reduction or reshaping may distort the intended shape of the contracted orbital.
Finally, it is important to emphasize that nearly all major quantum chemistry packages—including TURBOMOLE,63 NWChem,64 ORCA,65–67 Q-Chem,68 Molpro,69,70 Dalton71 and GAMESS72—perform internal normalization of user-supplied basis functions. Moreover, most quantum chemistry software packages include predefined reductions in their internal basis set databases. This is not specific to Gaussian 16 Rev. C.01; similar reductions are also present in the internal basis set database of NWChem. While NWChem documentation states that the basis set content is the responsibility of the system administrator, in practice, this means that the basis set database may vary depending on the specific installation or user's environment. Consequently, different installations of the same software version may include different internal basis sets. A similar situation applies to Gaussian, where internal corrections – such as modifications to the internal basis set database by the system administrator (e.g., based on the basis set exchange) – can also vary. This leads to ambiguity regarding the origin and extent of predefined basis set reductions. This behaviour, while intended to ensure stability and consistency, may obscure the origin of differences in computed properties and complicate reproducibility unless explicitly documented. Therefore, reporting the applied normalization strategy, or disabling internal renormalization where possible, should be considered standard practice in high-precision computations.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d5cp01681a
This journal is © the Owner Societies 2025