Sergei Manzhos,* Shunsaku Tsuda and Manabu Ihara*

School of Materials and Chemical Technology, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan. E-mail: Manzhos.s.aa@m.titech.ac.jp; mihara@chemeng.titech.ac.jp; Tel: +81-(0)3-5734-3918

Received 7th September 2022, Accepted 6th December 2022

First published on 7th December 2022

Machine learning (ML) based methods and tools have now firmly established themselves in physical chemistry, in particular in theoretical and computational chemistry and in materials chemistry. The generality of popular ML techniques such as neural networks or kernel methods (Gaussian process and kernel ridge regression and their flavors) has permitted their application to diverse problems, from the prediction of properties of functional materials (catalysts, solid state ionic conductors, etc.) from descriptors, to the building of interatomic potentials (where ML is now routinely used in applications) and electron density functionals. These ML techniques are assumed to possess the superior expressive power of nonlinear methods, and are often used “as is”, with concepts such as “non-parametric” or “deep learning” invoked without a clear justification for their need or advantage over simpler and more robust alternatives. In this Perspective, we highlight some interrelations between popular ML techniques and traditional linear regressions and basis expansions and demonstrate that in certain regimes (such as a very high dimensionality) these approximations might collapse. We also discuss ways to recover the expressive power of a nonlinear approach and to help select hyperparameters with the help of high-dimensional model representation, and to obtain elements of insight while preserving the generality of the method.

Machine learning (ML) methods are increasingly used for this purpose as they allow one to build structure–property/performance mappings in a black-box way, with limited need for domain-specific method or software choices. ML is also increasingly used in method development for computational and theoretical chemistry. Machine-learned interatomic potentials are now routine,^{13–18} and there is good potential for machine-learned functionals for DFT (density functional theory), including exchange–correlation and kinetic energy functionals, to become available in end-user codes in the near future.^{19–25}

Two classes of ML methods stand out as the most widely used in the above applications: neural networks (NNs)^{26} and kernel-based regression methods such as Gaussian process regression (GPR)^{27} or kernel ridge regression.^{28} Even though a single-hidden-layer NN is a universal approximator, multilayer NNs are often used, from traditional multilayer feed-forward NNs to more complex architectures such as convolutional NNs, making use of the concept of “deep learning”. Kernel methods are used for their “non-parametric” nature and have been argued to possess a higher learning power than NNs.^{29,30}

It is the generality of these ML techniques that permits their application to such diverse problems. It also allows easy use by non-experts, including students. This generality comes at the price of a lack of insight provided by physically motivated models. ML techniques (and black-box methods in general) are also notoriously bad at extrapolation. This means that despite their generality, domain knowledge should be used to define descriptors in such a way that the ML method is not called in the extrapolation regime. ML approaches such as NNs and other non-linear methods also require more data than physically motivated models and simple linear regressions. This can be an issue in high-dimensional spaces, where data acquisition may be costly (for example, CPU-costly ab initio calculations) and the low density of data may not allow for quality ML. We note that data density is bound to be low in high-dimensional spaces, and this issue cannot be resolved by simply adding more data, by virtue of the curse of dimensionality.^{31}

ML methods are often used “as is”, with concepts such as “non-parametric” or “deep learning” invoked without a clear justification for their need or advantage over simpler and more robust alternatives such as plain linear regressions or single-hidden-layer NNs. ML techniques are assumed to have the superior expressive power of nonlinear methods. In this Perspective, we highlight some interrelations between popular ML techniques and traditional linear regressions and basis expansions and demonstrate that in certain regimes (such as a very high dimensionality of the feature space) these approximations might collapse. We also discuss ways to recover the expressive power of a nonlinear approach with the help of high-dimensional model representation (HDMR),^{32–35} which also allows introducing elements of insight while preserving the general nature of the method.

f(x) ≈ Σ_{n=1}^{N} c_{n}σ(w_{n}·x + b_{n}) ≡ Σ_{n=1}^{N} c_{n}σ_{n}(x) (2.1.1)

Eqn (2.1.1) is an expansion over a flexible basis set {σ_{n}} that is tunable to the problem. Here we use the subscript n to indicate that even when the functional form is the same for all neurons, the basis functions differ due to the effect of w_{n} and b_{n}. This view also holds for the multilayer NN

(2.1.2)

(2.1.3)
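The view of eqn (2.1.1) as a linear expansion over a flexible basis can be made concrete with a minimal sketch (our illustration, not code from any cited work; the toy target function and all parameter values are assumptions): with the nonlinear parameters w_{n}, b_{n} frozen, fitting the outer coefficients c_{n} is a plain linear least-squares problem.

```python
import numpy as np

def nn_basis(X, W, b, sigma=np.tanh):
    # Evaluate the N "basis functions" sigma_n(x) = sigma(w_n . x + b_n)
    # for all M samples in X (M x D); returns the M x N design matrix.
    return sigma(X @ W.T + b)

rng = np.random.default_rng(0)
M, D, N = 200, 2, 20
X = rng.uniform(-1.0, 1.0, (M, D))
f = np.sin(X[:, 0]) * np.cos(X[:, 1])   # toy target function

W = rng.normal(size=(N, D))             # frozen nonlinear parameters w_n
b = rng.normal(size=N)                  # frozen biases b_n
B = nn_basis(X, W, b)                   # column n is the basis function sigma_n

# Linear regression over the basis {sigma_n}: f ~ B c, cf. eqn (2.1.1).
c, *_ = np.linalg.lstsq(B, f, rcond=None)
rmse = float(np.sqrt(np.mean((B @ c - f) ** 2)))
```

Optimizing W and b as well (i.e. training the NN proper) amounts to tuning the basis itself to the problem, which is what distinguishes the NN from a fixed-basis linear regression.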

(2.2.1)

f(x) = K*K^{−1}f (2.2.2)

(2.2.3)

K* = (k(x, x^{(1)}), k(x, x^{(2)}), …, k(x, x^{(M)})), K** = k(x, x) (2.2.4)

GPR is none other than a regularized linear regression, with a Tikhonov regularization parameter δ, over the basis b_{n}(x) = k(x, x^{(n)}):^{28}

c = (K + δI)^{−1}f (2.2.5)

Δf(x) = K** − K*K^{−1}K*^{T} (2.2.6)
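The GPR equations above can be sketched in a few lines (our toy illustration; the squared-exponential kernel choice, the 1D target, and all numerical values are assumptions), making explicit that the prediction is linear in the data once the kernel is fixed.

```python
import numpy as np

def sq_exp(A, B, l=0.5):
    # Squared-exponential kernel matrix, k(a, b) = exp(-|a - b|^2 / (2 l^2)).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * l**2))

rng = np.random.default_rng(1)
Xtr = rng.uniform(-1, 1, (50, 1))
ftr = np.sin(3.0 * Xtr[:, 0])

delta = 1e-8                                     # Tikhonov regularization parameter
K = sq_exp(Xtr, Xtr) + delta * np.eye(len(Xtr))  # regularized kernel matrix
c = np.linalg.solve(K, ftr)                      # linear coefficients K^{-1} f

Xte = rng.uniform(-1, 1, (20, 1))
Kstar = sq_exp(Xte, Xtr)                         # K* of eqn (2.2.4)
mean = Kstar @ c                                 # predictive mean, eqn (2.2.2)
# Predictive variance, eqn (2.2.6); K** = k(x, x) = 1 for this kernel.
var = 1.0 - np.einsum('ij,jk,ik->i', Kstar, np.linalg.inv(K), Kstar)
err = float(np.max(np.abs(mean - np.sin(3.0 * Xte[:, 0]))))
```

Note that the number of basis functions (columns of K) equals the number of training data M, which is the "square" structure discussed below.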

(3.2.1)

(3.2.2)

We proposed two ways to improve hyperparameter optimization. One is the rectangularization of GPR, using fewer basis functions than training data.^{61} This is related to some of the sparse GPR methods.^{62} Instead of the square matrix of eqn (2.2.5), one can use a rectangular matrix of size M × N with elements B_{mn} = k(x^{(n)}, x^{(m)}), where N is the number of basis functions and M the number of training data. When M > N, the rectangular version of eqn (2.2.5) in general cannot be solved exactly, and the residual of a least-squares solution can be used to guide hyperparameter optimization toward improved basis completeness,^{63–65}

(3.3.1)
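The rectangularization idea can be sketched as follows (a toy illustration under our own assumptions about the data, kernel, and centre placement, not the code of ref. 61): with N < M basis functions, the least-squares residual of the overdetermined system is a nontrivial function of the hyperparameter and can guide its optimization.

```python
import numpy as np

def sq_exp(A, B, l):
    # Squared-exponential kernel matrix between point sets A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * l**2))

rng = np.random.default_rng(2)
Xtr = rng.uniform(-1, 1, (200, 1))          # M = 200 training data
ftr = np.sin(8.0 * Xtr[:, 0])
centres = np.linspace(-1.0, 1.0, 30)[:, None]  # N = 30 < M kernel basis functions

def residual(l):
    # Least-squares residual of the rectangular system B c ~ f with
    # B_mn = k(x^(m), centre^(n)); a nonzero residual measures basis
    # incompleteness at this value of the length parameter l.
    B = sq_exp(Xtr, centres, l)
    c, *_ = np.linalg.lstsq(B, ftr, rcond=None)
    return float(np.sqrt(np.mean((B @ c - ftr) ** 2)))

ls = [0.01, 0.1, 0.3, 1.0, 10.0]
errs = [residual(l) for l in ls]
best_l = ls[int(np.argmin(errs))]           # residual-guided hyperparameter choice
```

Both very small l (overly localized basis) and very large l (nearly degenerate, almost-linear basis) show up as a large residual, whereas in the square M = N case the system can be solved (near-)exactly for almost any hyperparameter value.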

Another approach is using an additive model^{48,67,68} based on high-dimensional model representation (HDMR).^{33} We will introduce HDMR in more detail in Section 5; here it suffices to introduce a simple additive model

f^{add}(x) = Σ_{i=1}^{D} f^{GPR}_{i}(x_{i}) (3.3.2)

The component functions f^{GPR}_{i}(x_{i}) can be built with high confidence from few data (if they are built with GPR, eqn (2.2.6) will return a very low variance precisely because the one-dimensional f^{GPR}_{i}(x_{i}) are well-defined, and will generally grossly understate the fitting error f(x) − f^{add}(x), which is due to the additive approximation). A large pool of data can then be sampled from f^{add}(x) and used to optimize the hyperparameters of the high-dimensional GPR model of f(x). We have shown, using the example of fitting a 15-dimensional PES of UF_{6}, that "good enough" (albeit not perfect, due to the difference between f(x) and f^{add}(x)) hyperparameters can be identified in this way.^{49}

The data, however, may not be available on demand. The cost of their computation may be high (ab initio data for systems beyond small molecules being an example), and/or, because of the nature of the problem, the data may be very unevenly distributed. An example of this kind of application is machine learning of the Kohn–Sham kinetic energy density (KED) τ or its positive-definite version τ_{+} from the electron density for the construction of kinetic energy functionals (KEFs) for orbital-free DFT:^{70}

(4.1)

Fig. 1 Distributions of kinetic energy densities τ_{+} of crystalline aluminium and magnesium computed as described in ref. 19. The values are scaled to [0, 1].

Distributions of some density-dependent features (such as those in eqn (6.1) below) are even more extreme (see ref. 19). We found that in this case multilayer NNs are useful. While we were able to obtain good fits to the Kohn–Sham KED of individual materials with single-hidden-layer NNs, this was not possible when machine learning the KED of several materials simultaneously – which is needed to achieve portability of the KEF. A multilayer NN was able to achieve an accurate fit of the data from several materials simultaneously (Li, Mg, and Al in this instance, see ref. 20 for details).

(5.1.1)

d(t + Δt) = f(x(t)) (5.1.2)

The example above is not from physical chemistry and was chosen to illustrate the effect of an extremely large D. As ML makes further inroads into physical and computational chemistry, it is important to keep these effects in mind. Examples of situations where extremely high-dimensional feature spaces might arise are direct optimizations of grid point values or basis coefficients.^{25} A natural way to deal with this issue is to avoid using products of too many rapidly decaying functions. This can be achieved by using the high-dimensional model representation (HDMR).
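The collapse of product-form kernels at large D can be seen in a few lines (our illustration; the kernel, domain, and sample sizes are assumptions): for a squared-exponential kernel with l = 1 on [0, 1]^D, the expected squared distance between two random points grows linearly with D, so the kernel value between typical points decays roughly as exp(−D/12).

```python
import numpy as np

rng = np.random.default_rng(4)

def typical_kernel_value(D, l=1.0, trials=200):
    # Mean SE-kernel value k(x, x') = prod_i exp(-(x_i - x_i')^2 / (2 l^2))
    # between pairs of random points in [0, 1]^D.
    vals = []
    for _ in range(trials):
        x, xp = rng.uniform(0, 1, D), rng.uniform(0, 1, D)
        vals.append(np.exp(-np.sum((x - xp) ** 2) / (2.0 * l**2)))
    return float(np.mean(vals))

k2, k20, k2000 = (typical_kernel_value(D) for D in (2, 20, 2000))
# k ~ exp(-D/12): close to 1 at D = 2, already numerically negligible at
# D = 2000, so the K* elements of eqn (2.2.2) vanish and every prediction
# collapses toward the prior mean for any test point not (nearly) coinciding
# with a training point.
```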

f(x) = f_{0} + Σ_{i} f_{i}(x_{i}) + Σ_{i<j} f_{ij}(x_{i}, x_{j}) + … + Σ_{i_{1}<i_{2}<…<i_{d}} f_{i_{1}i_{2}…i_{d}}(x_{i_{1}}, x_{i_{2}}, …, x_{i_{d}}) (5.2.1)

Taken to d = D, this expansion is exact; when d < D, it is an approximation. Eqn (3.3.2) is a particular case of eqn (5.2.1) for d = 1. In most real-life applications, the importance of the orders of coupling, i.e. the magnitude of the component functions f_{i1i2…id}(x_{i1}, x_{i2}, …, x_{id}), drops rapidly with d.^{32} We specifically consider RS (random sampling) HDMR,^{32,33} which allows constructing all f_{i1i2…id}(x_{i1}, x_{i2}, …, x_{id}) from one and the same set of samples of f(x), however distributed in the D-dimensional space (the term "random sampling" should be understood in the sense of allowing any distribution rather than requiring randomness, but we will follow the terminology used in the original HDMR literature^{32–35}). This is not to be confused with the N-mode representation,^{72} which has the same form as eqn (5.2.1) but whose component functions are sampled on sub-dimensional hyperplanes passing through an expansion centre and therefore require a separate dataset for each term (the N-mode approach is a particular case of HDMR called cut-HDMR^{33}).

The advantage of an HDMR form with d < D is that lower-dimensional terms are easier to construct and are easier to use in applications (e.g. when integration is required). A major advantage of HDMR is that lower-dimensional terms can be reliably recovered from fewer data.^{32,47,49,68} As sampling in multidimensional spaces is bound to be sparse, this is attractive for multi-dimensional problems. The original formulation of RS-HDMR required computing f_{i1i2…id}(x_{i1}, x_{i2},…,x_{id}) as (D–d)-dimensional integrals, which may be very costly.^{32,33} Some of us previously introduced combinations of HDMR with neural networks (RS-HDMR-NN)^{73–75} and, recently, with Gaussian process regressions (RS-HDMR-GPR)^{47,68} which allow dispensing with integrals and also allow combining terms of any dimensionality, e.g. one may lump terms with d′ < d into d-dimensional terms, in which case the approximation becomes

f(x) ≈ Σ_{i_{1}<i_{2}<…<i_{d}} f_{i_{1}i_{2}…i_{d}}(x_{i_{1}}, x_{i_{2}}, …, x_{i_{d}}) (5.2.2)

Specifically, when using GPR in multi-dimensional spaces, HDMR allows using lower-dimensional kernels and avoiding some of the problems associated with Matérn-type kernels at very high D. To achieve the approximation of eqn (5.2.2) with GPR, one can define a custom kernel which is itself of an HDMR form:^{48,67}

k(x, x′) = Σ_{i_{1}<i_{2}<…<i_{d}} k((x_{i_{1}}, …, x_{i_{d}}), (x′_{i_{1}}, …, x′_{i_{d}})) (5.2.3)

k(x, x′) = exp(−|x − x′|^{2}/(2l^{2})) (5.2.4)

Small values of l preserve the higher expressive power of a nonlinear method, as can be seen in Fig. 3, where we show an example of a component function f_{i}(x_{i}) of a 1st order HDMR-GPR model (eqn (3.2.2)) obtained with different values of the length parameter of the square exponential kernel. When l is large, the component functions are practically linear, and the model becomes equivalent to a plain linear regression f(x) = cx. Smaller values of l, enabled by the HDMR structure avoiding a product of many functions with values smaller than 1, allow the HDMR-GPR model to construct the most suitable, non-linear basis functions, which are the component functions f_{i}(x_{i}).
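The benefit of an HDMR-form kernel for small l can be shown numerically (a sketch with assumed kernel forms and parameter values, not the kernels of ref. 48 and 67): the full-dimensional SE kernel is a product of D factors smaller than 1 and underflows for small l at large D, while an additive (1st order HDMR-form) kernel sums the same one-dimensional factors and remains well-behaved.

```python
import numpy as np

rng = np.random.default_rng(5)
D, l = 500, 0.2                                  # high dimension, small length scale
x, xp = rng.uniform(0, 1, D), rng.uniform(0, 1, D)

factors = np.exp(-(x - xp) ** 2 / (2.0 * l**2))  # one-dimensional SE factors, each <= 1

k_full = float(np.prod(factors))  # product kernel: collapses to ~0 at this D and l
k_add = float(np.sum(factors))    # additive HDMR-form kernel: stays O(D), usable
```

With k_full numerically zero between essentially all point pairs, the full-dimensional GPR loses its ability to interpolate, while the additive kernel still discriminates between near and far points along each coordinate.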

The combination of HDMR with GPR allows assessing the relative importance of different combinations of variables by comparing the length parameters of the kernels of different component functions. For example, in ref. 67 and 68 kinetic energy densities τ of Al, Mg, and Si were fitted with HDMR-GPR as a function(al) of the terms of the 4th order gradient expansion^{76} and the product of electron density ρ(r) and the Kohn–Sham effective potential V_{eff}(r):

(6.1)

(6.2)

Features | Var{f_{i1,i2}} | l
x_{1}, x_{2} | 3.36 × 10^{−2} | 4.94 × 10^{−1}
x_{1}, x_{3} | 4.51 × 10^{−4} | 3.33 × 10^{2}
x_{1}, x_{4} | 3.69 × 10^{−9} | 10^{5}
x_{1}, x_{5} | 3.70 × 10^{−9} | 10^{5}
x_{1}, x_{6} | 3.68 × 10^{−9} | 10^{5}
x_{1}, x_{7} | 3.94 × 10^{−9} | 10^{5}
x_{2}, x_{3} | 5.09 × 10^{−3} | 2.16 × 10^{1}
x_{2}, x_{4} | 3.11 × 10^{−10} | 10^{5}
x_{2}, x_{5} | 2.81 × 10^{−10} | 10^{5}
x_{2}, x_{6} | 1.65 × 10^{−10} | 10^{5}
x_{2}, x_{7} | 6.03 × 10^{−3} | 3.12 × 10^{1}
x_{3}, x_{4} | 9.87 × 10^{−10} | 10^{5}
x_{3}, x_{5} | 9.95 × 10^{−10} | 10^{5}
x_{3}, x_{6} | 1.03 × 10^{−9} | 10^{5}
x_{3}, x_{7} | 3.53 × 10^{−2} | 9.51 × 10^{−2}
x_{4}, x_{5} | 6.70 × 10^{−10} | 10^{5}
x_{4}, x_{6} | 4.76 × 10^{−10} | 10^{5}
x_{4}, x_{7} | 5.08 × 10^{−10} | 10^{5}
x_{5}, x_{6} | 1.38 × 10^{−10} | 10^{5}
x_{5}, x_{7} | 1.78 × 10^{−10} | 10^{5}
x_{6}, x_{7} | 1.52 × 10^{−2} | 2.48 × 10^{−1}

Some component functions have a very high l and a very low variance – these can effectively be excluded as a result of this analysis, thereby alleviating the issue of the combinatorial scaling of the number of HDMR terms. It was also shown that the importance of terms depends on the amount of available training data, highlighting that the density of sampling determines the number of coupling terms that can be recovered.^{73} In that work, isotropic kernels were used for each component function; using anisotropic kernels would provide even more detail about the relative importance of variables within different subsets. The information about the relative importance of subsets of features obtained with HDMR-GPR can be used independently for building approximations relying on the most important subsets with any method (not necessarily ML-based).
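The screening idea — rank variable pairs by the variance of their coupling component and drop negligible ones — can be sketched without any GPR machinery (a crude illustration with our own toy function; the real analysis in ref. 67 and 68 ranks terms via the fitted kernel length parameters): estimate the RS-HDMR projections by conditional means over bins and compare Var{f_ij} across pairs.

```python
import numpy as np

rng = np.random.default_rng(6)
M, D = 2000, 4
X = rng.uniform(-1, 1, (M, D))
# Toy function: only the first pair of variables is actually coupled.
f = np.sin(2.0 * X[:, 0]) + X[:, 1] ** 2 + 1.5 * X[:, 0] * X[:, 1]

def coupling_variance(i, j, nbins=5):
    # Estimate f0, f_i, f_j and the pair component f_ij as conditional
    # means over bins (crude RS-HDMR projections), then return Var{f_ij}.
    bi = np.clip(((X[:, i] + 1) / 2 * nbins).astype(int), 0, nbins - 1)
    bj = np.clip(((X[:, j] + 1) / 2 * nbins).astype(int), 0, nbins - 1)
    f0 = f.mean()
    fi = np.array([f[bi == a].mean() for a in range(nbins)]) - f0
    fj = np.array([f[bj == b].mean() for b in range(nbins)]) - f0
    fij = np.zeros(M)
    for a in range(nbins):
        for b in range(nbins):
            m = (bi == a) & (bj == b)
            if m.any():
                fij[m] = f[m].mean() - f0 - fi[a] - fj[b]
    return float(fij.var())

variances = {(i, j): coupling_variance(i, j) for i in range(D) for j in range(i + 1, D)}
# The coupled pair stands out by orders of magnitude; the remaining pairs
# can be dropped from the HDMR expansion, taming its combinatorial growth.
```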

In this Perspective, we tried to bring attention to some interconnections, as well as tricky aspects, of commonly used methods, specifically focusing on neural networks and kernel-based regression methods, which have found the widest use to date in the above-mentioned applications. Despite the popularity of multilayer NNs ("deep learning"), in our experience with ML of interatomic potentials, kinetic energy densities, as well as in other applications, single-hidden-layer NNs are sufficient and more efficient unless the data distribution is very uneven. Both NNs and kernel-type regressions (considered here using the example of GPR) can be viewed as expansions over parameterized, non-direct-product bases. Both allow achieving a sum-of-products representation, which is very useful when computing integrals (in quantum dynamics applications and elsewhere). The basis set of GPR is much less flexible than that formed by NN neurons, and typically many more basis functions (as many as there are training data) are needed than NN neurons for the same quality of regression. The previously reported perceived ability of GPR to achieve a test set error similar to that of an NN with fewer data has to do with the robustness of the linear regression that GPR is. GPR is not necessarily advantageous with respect to the total number of parameters (linear and nonlinear), although it is advantageous over NNs in the typically small number of nonlinear (hyper)parameters used. The square nature of the linear problem in standard GPR is not well-suited for hyperparameter optimization; rectangularization of the GPR equations facilitates hyperparameter optimization.

When the dimensionality of the feature space is very large, GPR with common Matérn kernels may fail when the distribution of the test data differs at all from that of the training data. In this case, using representations with lower-dimensional terms, based in particular on HDMR, is useful. It is also useful because with a low density of training data (which is always the case in high-dimensional spaces and cannot be fixed by simply adding however many more data, because of the curse of dimensionality), only lower-order coupling terms can be recovered. Representations with lower-dimensional terms built with ML methods are relatively easy to construct (e.g. by defining a GPR kernel in an HDMR form). They can also facilitate hyperparameter optimization. They also allow obtaining elements of insight (the sore point of black-box methods) while preserving the generality of the method, in particular, informing on the relative importance of different combinations of features. Random sampling HDMR, which allows constructing all coupling terms from one and the same dataset, should be explored for applications with VSCF and VCI, as it has the potential to significantly simplify PES construction in the form needed by those methods. Overall, there is an advantage in going beyond off-the-shelf methods; one can achieve more powerful approaches when using these methods as a base for more involved approaches such as HDMR-NN or HDMR-GPR combinations.

- Q. Tong, P. Gao, H. Liu, Y. Xie, J. Lv, Y. Wang and J. Zhao, J. Phys. Chem. Lett., 2020, 11, 8710–8720.
- W. P. Walters and R. Barzilay, Acc. Chem. Res., 2021, 54, 263–270.
- R. Ramprasad, R. Batra, G. Pilania, A. Mannodi-Kanakkithodi and C. Kim, npj Comput. Mater., 2017, 3, 1–13.
- A. Y.-T. Wang, R. J. Murdock, S. K. Kauwe, A. O. Oliynyk, A. Gurlo, J. Brgoch, K. A. Persson and T. D. Sparks, Chem. Mater., 2020, 32, 4954–4965.
- K. T. Butler, D. W. Davies, H. Cartwright, O. Isayev and A. Walsh, Nature, 2018, 559, 547–555.
- S. M. Moosavi, K. M. Jablonka and B. Smit, J. Am. Chem. Soc., 2020, 142, 20273–20287.
- M. del Cueto and A. Troisi, Phys. Chem. Chem. Phys., 2021, 23, 14156–14163.
- S. Manzhos and M. Ihara, PhysChemComm, 2022, 2, 72–95.
- S. R. Kalidindi, J. Appl. Phys., 2020, 128, 041103.
- S. Li, Y. Liu, D. Chen, Y. Jiang, Z. Nie and F. Pan, Wiley Interdiscip. Rev.: Comput. Mol. Sci., 2022, 12, e1558.
- P. Schlexer Lamoureux, K. T. Winther, J. A. Garrido Torres, V. Streibel, M. Zhao, M. Bajdich, F. Abild-Pedersen and T. Bligaard, ChemCatChem, 2019, 11, 3581–3601.
- S. Palkovits, ChemCatChem, 2020, 12, 3995–4008.
- J. Behler, J. Chem. Phys., 2016, 145, 170901.
- J. Behler, Int. J. Quantum Chem., 2015, 115, 1032–1050.
- S. Manzhos and T. Carrington, Chem. Rev., 2021, 121, 10187–10217.
- S. Manzhos, R. Dawes and T. Carrington, Int. J. Quantum Chem., 2015, 115, 1012–1020.
- I. Poltavsky and A. Tkatchenko, J. Phys. Chem. Lett., 2021, 12, 6551–6564.
- O. T. Unke, S. Chmiela, H. E. Sauceda, M. Gastegger, I. Poltavsky, K. T. Schütt, A. Tkatchenko and K.-R. Müller, Chem. Rev., 2021, 121, 10142–10186.
- S. Manzhos and P. Golub, J. Chem. Phys., 2020, 153, 074104.
- P. Golub and S. Manzhos, Phys. Chem. Chem. Phys., 2018, 21, 378–395.
- M. Fujinami, R. Kageyama, J. Seino, Y. Ikabata and H. Nakai, Chem. Phys. Lett., 2020, 748, 137358.
- J. Seino, R. Kageyama, M. Fujinami, Y. Ikabata and H. Nakai, Chem. Phys. Lett., 2019, 734, 136732.
- J. C. Snyder, M. Rupp, K. Hansen, L. Blooston, K.-R. Müller and K. Burke, J. Chem. Phys., 2013, 139, 224104.
- K. Yao and J. Parkhill, J. Chem. Theory Comput., 2016, 12, 1139–1147.
- S. Manzhos, Mach. Learn.: Sci. Technol., 2020, 1, 013002.
- G. Montavon, G. B. Orr and K.-R. Mueller, Neural Networks: Tricks of the Trade, Springer, Berlin Heidelberg, 2nd edn, 2012.
- C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning, MIT Press, Cambridge MA, USA, 2006.
- C. M. Bishop, Pattern Recognition and Machine Learning, Springer, Singapore, 2006.
- A. Kamath, R. A. Vargas-Hernández, R. V. Krems, T. Carrington and S. Manzhos, J. Chem. Phys., 2018, 148, 241702.
- R. M. Neal, PhD Thesis, University of Toronto, 1995.
- D. L. Donoho, AMS Conference on Math Challenges of the 21st Century, AMS, 2000.
- G. Li, J. Hu, S.-W. Wang, P. G. Georgopoulos, J. Schoendorf and H. Rabitz, J. Phys. Chem. A, 2006, 110, 2474–2485.
- H. Rabitz and Ö. F. Aliş, J. Math. Chem., 1999, 25, 197–233.
- Ö. F. Alış and H. Rabitz, J. Math. Chem., 2001, 29, 127–142.
- G. Li, S.-W. Wang and H. Rabitz, J. Phys. Chem. A, 2002, 106, 8721–8733.
- A. N. Gorban, Appl. Math. Lett., 1998, 11, 45–49.
- K. Hornik, Neural Networks, 1991, 4, 251–257.
- K. Hornik, M. Stinchcombe and H. White, Neural Networks, 1990, 3, 551–560.
- V. Kůrková, Neural Networks, 1992, 5, 501–506.
- M. G. Genton, J. Mach. Learn. Res., 2001, 2, 299–312.
- I. J. Myung, J. Math. Psychol., 2003, 47, 90–100.
- J. Bergstra and Y. Bengio, J. Mach. Learn. Res., 2012, 13, 281–305.
- E. Brochu, V. M. Cora and N. de Freitas, arXiv, 2010, preprint, arXiv:1012.2599, DOI:10.48550/arXiv.1012.2599.
- J. Snoek, H. Larochelle and R. P. Adams, Advances in Neural Information Processing Systems, ed. F. Pereira, C. J. C. Burges, L. Bottou and K. Q. Weinberger, Curran Associates, Inc., vol. 25, 2012.
- M. Fischetti and M. Stringher, arXiv, 2019, preprint, arXiv:1906.01504, DOI:10.48550/arXiv.1906.01504.
- H. Alibrahim and S. A. Ludwig, in 2021 IEEE Congress on Evolutionary Computation (CEC), 2021, pp. 1551–1559.
- M. A. Boussaidi, O. Ren, D. Voytsekhovsky and S. Manzhos, J. Phys. Chem. A, 2020, 124, 7598–7607.
- D. Duvenaud, H. Nickisch and C. E. Rasmussen, Advances in Neural Information Processing Systems, 2011, pp. 226–234.
- S. Manzhos and M. Ihara, J. Math. Chem., 2022, DOI:10.1007/s10910-022-01407-x.
- S. Bubeck and M. Sellke, Advances in Neural Information Processing Systems, Curran Associates, Inc., 2021, vol. 34, pp. 28811–28822.
- G.-B. Huang, Q.-Y. Zhu and C.-K. Siew, Neurocomputing, 2006, 70, 489–501.
- Y. Liao, S.-C. Fang and H. L. W. Nuttle, Neural Networks, 2003, 16, 1019–1028.
- W. Wu, D. Nan, J. Long and Y. Ma, Neural Networks, 2008, 21, 1464–1465.
- M. H. Beck, A. Jäckle, G. A. Worth and H.-D. Meyer, Phys. Rep., 2000, 324, 1–105.
- S. Manzhos and T. Carrington, J. Chem. Phys., 2006, 125, 194105.
- M. Schmitt, Neural Comput., 2002, 14, 241–301.
- W. Koch and D. H. Zhang, J. Chem. Phys., 2014, 141, 021101.
- A. Brown and E. Pradhan, J. Theor. Comput. Chem., 2017, 16, 1730001.
- E. Pradhan and A. Brown, J. Chem. Phys., 2016, 144, 174305.
- E. Pradhan and A. Brown, J. Mol. Spectrosc., 2016, 330, 158–164.
- S. Manzhos and M. Ihara, arXiv, 2022, preprint, arXiv:2112.02467, DOI:10.48550/arXiv.2112.02467.
- V. L. Deringer, A. P. Bartók, N. Bernstein, D. M. Wilkins, M. Ceriotti and G. Csányi, Chem. Rev., 2021, 121, 10073–10141.
- S. Manzhos, K. Yamashita and T. Carrington, Chem. Phys. Lett., 2011, 511, 434–439.
- M. Chan, S. Manzhos, T. Carrington and K. Yamashita, J. Chem. Theory Comput., 2012, 8, 2053–2061.
- S. Manzhos, T. J. Carrington and K. Yamashita, J. Phys. Chem. Lett., 2011, 2, 2193–2199.
- R. Penrose, Math. Proc. Cambridge Philos. Soc., 1955, 51, 406–413.
- S. Manzhos, E. Sasaki and M. Ihara, Mach. Learn.: Sci. Technol., 2022, 3, 01LT02.
- O. Ren, M. A. Boussaidi, D. Voytsekhovsky, M. Ihara and S. Manzhos, Comput. Phys. Commun., 2021, 108220.
- S. Manzhos, X. Wang, R. Dawes and T. Carrington, J. Phys. Chem. A, 2006, 110, 5295–5304.
- W. C. Witt, B. G. del Rio, J. M. Dieterich and E. A. Carter, J. Mater. Res., 2018, 33, 777–795.
- E. Sasaki, M.Eng. thesis, Tokyo Institute of Technology, 2022.
- S. Carter, S. J. Culik and J. M. Bowman, J. Chem. Phys., 1997, 107, 10458–10469.
- S. Manzhos and T. Carrington, J. Chem. Phys., 2006, 125, 084109.
- S. Manzhos, K. Yamashita and T. Carrington, Comput. Phys. Commun., 2009, 180, 2002–2012.
- S. Manzhos, K. Yamashita and T. Carrington, in Coping with Complexity: Model Reduction and Data Analysis, ed. A. N. Gorban and D. Roose, Springer, Berlin, Heidelberg, 2011, pp. 133–149.
- C. H. Hodges, Can. J. Phys., 1973, 51, 1428–1437.
- R. J. Bartlett and D. S. Ranasinghe, Chem. Phys. Lett., 2017, 669, 54–70.
- S. Manzhos and M. Ihara, Phys. Chem. Chem. Phys., 2022, 24, 15158–15172.
- T. K. Roy and R. B. Gerber, Phys. Chem. Chem. Phys., 2013, 15, 9468–9492.
- D. Shemesh, J. Mullin, M. S. Gordon and R. B. Gerber, Chem. Phys., 2008, 347, 218–228.
- P. Carbonnière, A. Dargelos and C. Pouchan, Theor. Chem. Acc., 2010, 125, 543–554.
- A. Erba, J. Maul, M. Ferrabone, P. Carbonnière, M. Rérat and R. Dovesi, J. Chem. Theory Comput., 2019, 15, 3755–3765.
- A. Erba, J. Maul, M. Ferrabone, R. Dovesi, M. Rérat and P. Carbonnière, J. Chem. Theory Comput., 2019, 15, 3766–3777.
- H. Kulik, T. Hammerschmidt, J. Schmidt, S. Botti, M. A. L. Marques, M. Boley, M. Scheffler, M. Todorović, P. Rinke, C. Oses, A. Smolyanyuk, S. Curtarolo, A. Tkatchenko, A. Bartok, S. Manzhos, M. Ihara, T. Carrington, J. Behler, O. Isayev, M. Veit, A. Grisafi, J. Nigam, M. Ceriotti, K. T. Schütt, J. Westermayr, M. Gastegger, R. Maurer, B. Kalita, K. Burke, R. Nagai, R. Akashi, O. Sugino, J. Hermann, F. Noé, S. Pilati, C. Draxl, M. Kuban, S. Rigamonti, M. Scheidgen, M. Esters, D. Hicks, C. Toher, P. Balachandran, I. Tamblyn, S. Whitelam, C. Bellinger and L. M. Ghiringhelli, Electron. Struct., 2022, 4, 0230004.

This journal is © the Owner Societies 2023