Flexible and efficient Gaussian process models for machine learning

Gaussian process (GP) models are widely used to perform Bayesian nonlinear regression and classification tasks that are central to many machine learning problems. A GP is nonparametric, meaning that the complexity of the model grows as more data points are received. Another attractive feature is the behaviour of the error bars: they naturally grow in regions away from the training data, where we have high uncertainty about the interpolating function. In their standard form GPs have several limitations, which can be divided into two broad categories: computational difficulties for large data sets, and restrictive modelling assumptions for complex data sets. This thesis addresses various aspects of both of these problems. Training a GP has O(N³) complexity, where N is the number of training data points, because the N × N covariance matrix must be inverted. In this thesis we develop several new techniques to reduce this complexity to O(NM²), where M is a user-chosen number much smaller than N. The sparse approximation we use is based on a set of M 'pseudo-inputs', which are optimised together with the hyperparameters at training time. We develop a further approximation, based on clustering inputs, that can be seen as a mixture of local and global approximations. Standard GPs assume a uniform noise variance; we use the sparse approximation described above as a way of relaxing this assumption, since by modifying the sparse covariance function we can model input-dependent noise. To handle high-dimensional data sets we use supervised linear dimensionality reduction. As another extension of the standard GP, we relax the Gaussianity assumption of the process by learning a nonlinear transformation of the output space. All these techniques further increase the applicability of GPs to real, complex data sets. We present empirical comparisons of our algorithms with various competing techniques, and suggest problem-dependent strategies to follow in practice.
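
The complexity claims in the abstract can be made concrete. The sketch below is illustrative only and is not the thesis's code: it contrasts exact GP regression, whose training cost is dominated by the Cholesky factorisation of the full N × N covariance (O(N³)), with a subset-of-regressors-style sparse approximation built on M pseudo-inputs, which only factorises M × M matrices and so costs O(NM²). The kernel, data, pseudo-input locations, and function names are assumptions made for the example.

# Hedged sketch (not the thesis's implementation): exact GP regression vs.
# a pseudo-input style sparse approximation, to illustrate O(N^3) vs O(N M^2).
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d2 = (A[:, None] - B[None, :]) ** 2
    return signal_var * np.exp(-0.5 * d2 / lengthscale ** 2)

def exact_gp_predict(X, y, Xstar, noise_var=0.1):
    """Exact GP regression: factorise the full N x N covariance (O(N^3))."""
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))      # N x N
    L = np.linalg.cholesky(K)                               # cubic-cost step
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return rbf_kernel(Xstar, X) @ alpha                     # predictive mean

def sparse_gp_predict(X, y, Xstar, Xm, noise_var=0.1, jitter=1e-6):
    """Subset-of-regressors style sparse GP with M pseudo-inputs Xm.

    Only M x M systems are solved, so training cost scales as O(N M^2).
    (The thesis's SPGP/FITC additionally corrects the diagonal; omitted here.)
    """
    Kmm = rbf_kernel(Xm, Xm) + jitter * np.eye(len(Xm))     # M x M
    Knm = rbf_kernel(X, Xm)                                 # N x M
    A = noise_var * Kmm + Knm.T @ Knm                       # M x M system
    mu_m = np.linalg.solve(A, Knm.T @ y)
    return rbf_kernel(Xstar, Xm) @ mu_m                     # predictive mean

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.sort(rng.uniform(-3, 3, 200))                    # N = 200
    y = np.sin(X) + 0.1 * rng.standard_normal(200)
    Xstar = np.linspace(-3, 3, 5)
    Xm = np.linspace(-3, 3, 15)                             # M = 15 pseudo-inputs
    print(exact_gp_predict(X, y, Xstar))
    print(sparse_gp_predict(X, y, Xstar, Xm))

In the thesis's approach the pseudo-input locations are not fixed on a grid as above; as the abstract states, they are optimised together with the hyperparameters at training time.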

Bibliographic Details
Main Author: Snelson, Edward Lloyd
Published: University College London (University of London) 2007
Subjects:
Online Access:http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.445905
id ndltd-bl.uk-oai-ethos.bl.uk-445905
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-445905 2017-10-04T03:13:29Z
Flexible and efficient Gaussian process models for machine learning
Snelson, Edward Lloyd
2007
Gaussian process (GP) models are widely used to perform Bayesian nonlinear regression and classification tasks that are central to many machine learning problems. A GP is nonparametric, meaning that the complexity of the model grows as more data points are received. Another attractive feature is the behaviour of the error bars: they naturally grow in regions away from the training data, where we have high uncertainty about the interpolating function. In their standard form GPs have several limitations, which can be divided into two broad categories: computational difficulties for large data sets, and restrictive modelling assumptions for complex data sets. This thesis addresses various aspects of both of these problems. Training a GP has O(N³) complexity, where N is the number of training data points, because the N × N covariance matrix must be inverted. In this thesis we develop several new techniques to reduce this complexity to O(NM²), where M is a user-chosen number much smaller than N. The sparse approximation we use is based on a set of M 'pseudo-inputs', which are optimised together with the hyperparameters at training time. We develop a further approximation, based on clustering inputs, that can be seen as a mixture of local and global approximations. Standard GPs assume a uniform noise variance; we use the sparse approximation described above as a way of relaxing this assumption, since by modifying the sparse covariance function we can model input-dependent noise. To handle high-dimensional data sets we use supervised linear dimensionality reduction. As another extension of the standard GP, we relax the Gaussianity assumption of the process by learning a nonlinear transformation of the output space. All these techniques further increase the applicability of GPs to real, complex data sets. We present empirical comparisons of our algorithms with various competing techniques, and suggest problem-dependent strategies to follow in practice.
006.31
University College London (University of London)
http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.445905
http://discovery.ucl.ac.uk/1445855/
Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 006.31
spellingShingle 006.31
Snelson, Edward Lloyd
Flexible and efficient Gaussian process models for machine learning
description Gaussian process (GP) models are widely used to perform Bayesian nonlinear regression and classification tasks that are central to many machine learning problems. A GP is nonparametric, meaning that the complexity of the model grows as more data points are received. Another attractive feature is the behaviour of the error bars: they naturally grow in regions away from the training data, where we have high uncertainty about the interpolating function. In their standard form GPs have several limitations, which can be divided into two broad categories: computational difficulties for large data sets, and restrictive modelling assumptions for complex data sets. This thesis addresses various aspects of both of these problems. Training a GP has O(N³) complexity, where N is the number of training data points, because the N × N covariance matrix must be inverted. In this thesis we develop several new techniques to reduce this complexity to O(NM²), where M is a user-chosen number much smaller than N. The sparse approximation we use is based on a set of M 'pseudo-inputs', which are optimised together with the hyperparameters at training time. We develop a further approximation, based on clustering inputs, that can be seen as a mixture of local and global approximations. Standard GPs assume a uniform noise variance; we use the sparse approximation described above as a way of relaxing this assumption, since by modifying the sparse covariance function we can model input-dependent noise. To handle high-dimensional data sets we use supervised linear dimensionality reduction. As another extension of the standard GP, we relax the Gaussianity assumption of the process by learning a nonlinear transformation of the output space. All these techniques further increase the applicability of GPs to real, complex data sets. We present empirical comparisons of our algorithms with various competing techniques, and suggest problem-dependent strategies to follow in practice.
author Snelson, Edward Lloyd
author_facet Snelson, Edward Lloyd
author_sort Snelson, Edward Lloyd
title Flexible and efficient Gaussian process models for machine learning
title_short Flexible and efficient Gaussian process models for machine learning
title_full Flexible and efficient Gaussian process models for machine learning
title_fullStr Flexible and efficient Gaussian process models for machine learning
title_full_unstemmed Flexible and efficient Gaussian process models for machine learning
title_sort flexible and efficient gaussian process models for machine learning
publisher University College London (University of London)
publishDate 2007
url http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.445905
work_keys_str_mv AT snelsonedwardlloyd flexibleandefficientgaussianprocessmodelsformachinelearning
_version_ 1718542597491785728
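
The description above mentions modifying the sparse covariance function to model input-dependent noise. As a hedged sketch, in the notation of the published pseudo-input (SPGP/FITC) work rather than anything quoted from the record, the approximate training covariance combines a low-rank term with a corrected diagonal, where K_NM is the covariance between the N training inputs and the M pseudo-inputs:

% Sketch of the FITC/SPGP covariance approximation (symbols are assumptions
% following the published SPGP notation, not text from this record).
\[
  K_{NN} \;\approx\; Q_{NN} + \operatorname{diag}\!\left(K_{NN} - Q_{NN}\right) + \sigma^{2} I,
  \qquad
  Q_{NN} = K_{NM}\, K_{MM}^{-1}\, K_{MN}.
\]

Because only the M × M matrix K_MM must be inverted and the remaining terms are diagonal or N × M, training scales as O(NM²); allowing the noise term to vary with the input, rather than using a constant σ², is one way to relax the uniform-noise assumption in the direction the abstract describes.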