Rectangularization of Gaussian process regression for optimization of hyperparameters


Bibliographic Details
Published in: Machine Learning with Applications
Main Authors: Sergei Manzhos, Manabu Ihara
Format: Article
Language: English
Published: Elsevier 2023-09-01
Online Access: http://www.sciencedirect.com/science/article/pii/S2666827023000403
Description
Summary: Gaussian process regression (GPR) is a powerful machine learning method that has recently seen wider use, particularly in the physical sciences. In its original formulation, GPR uses a square matrix of covariances among the training data and can be viewed as a linear regression problem with equal numbers of training data and basis functions. When data are sparse, avoiding overfitting and optimizing the hyperparameters of GPR are difficult, particularly in high-dimensional spaces, where data sparsity cannot practically be resolved by adding more data. The choice of hyperparameters, however, determines the success or failure of a GPR application. We show that hyperparameter optimization is facilitated by rectangularization of the defining equation of GPR. Using the example of a 15-dimensional molecular potential energy surface, we demonstrate that this approach allows effective hyperparameter tuning even with very sparse data.
ISSN: 2666-8270
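
Illustration: The summary describes GPR as a linear regression problem with a square covariance matrix, and a rectangularized form of the same equation. The sketch below shows one plausible reading of that idea, not the authors' implementation. It assumes a squared-exponential kernel, assumes that rectangularization means using fewer basis functions (kernel centers) than training points and solving the resulting system by least squares, and uses a small synthetic 3-dimensional data set rather than the 15-dimensional potential energy surface from the article. The names rbf_kernel, fit_square, fit_rectangular, and predict are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, length_scale):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)


def fit_square(X, y, length_scale, noise=1e-8):
    """Square form: N training points and N basis functions.

    Solves (K + noise * I) c = y, where K is the N x N covariance matrix.
    """
    K = rbf_kernel(X, X, length_scale)
    return np.linalg.solve(K + noise * np.eye(len(X)), y)


def fit_rectangular(X, y, centers, length_scale):
    """Assumed rectangular form: N training points, M < N basis functions.

    The N x M system K_rect c = y is overdetermined and is solved in the
    least-squares sense.
    """
    K_rect = rbf_kernel(X, centers, length_scale)
    c, *_ = np.linalg.lstsq(K_rect, y, rcond=None)
    return c


def predict(X_new, centers, c, length_scale):
    """Evaluate the fitted expansion at new points."""
    return rbf_kernel(X_new, centers, length_scale) @ c


# Toy data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(40, 3))
y = np.sin(X.sum(axis=1))
centers = X[:15]  # M = 15 basis functions for N = 40 training points

# Scan one hyperparameter and record the training residual of the
# rectangular system for each value.
rmse_by_length_scale = {}
for length_scale in (0.3, 1.0, 3.0):
    c = fit_rectangular(X, y, centers, length_scale)
    fitted = predict(X, centers, c, length_scale)
    rmse_by_length_scale[length_scale] = float(np.sqrt(np.mean((fitted - y) ** 2)))
    print(f"length_scale={length_scale}: training RMSE={rmse_by_length_scale[length_scale]:.3e}")
```

In the square form the coefficients are determined by as many equations as unknowns, so the fit on the training set says little about the quality of the hyperparameters. In the rectangular sketch the system is overdetermined, so its least-squares residual varies with the length scale and can be scanned as shown in the loop. Whether this matches the criterion used in the article is an assumption.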