Flexible and efficient Gaussian process models for machine learning

Gaussian process (GP) models are widely used to perform Bayesian nonlinear regression and classification tasks that are central to many machine learning problems. A GP is nonparametric, meaning that the complexity of the model grows as more data points are received. Another attractive feature is the behaviour of the error bars: they naturally grow in regions away from the training data, where we have high uncertainty about the interpolating function. In their standard form GPs have several limitations, which can be divided into two broad categories: computational difficulties for large data sets, and restrictive modelling assumptions for complex data sets. This thesis addresses various aspects of both of these problems. Training a GP has O(N³) complexity, where N is the number of training data points, because the N × N covariance matrix must be inverted. In this thesis we develop several new techniques to reduce this complexity to O(NM²), where M is a user-chosen number much smaller than N. The sparse approximation we use is based on a set of M 'pseudo-inputs', which are optimised together with the hyperparameters at training time. We develop a further approximation, based on clustering inputs, that can be seen as a mixture of local and global approximations. Standard GPs assume a uniform noise variance; we use the sparse approximation described above as a way of relaxing this assumption, since by modifying the sparse covariance function we can model input-dependent noise. To handle high-dimensional data sets we use supervised linear dimensionality reduction. As another extension of the standard GP, we relax the Gaussianity assumption of the process by learning a nonlinear transformation of the output space. All these techniques further increase the applicability of GPs to real, complex data sets. We present empirical comparisons of our algorithms with various competing techniques, and suggest problem-dependent strategies to follow in practice.
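
The complexity claims in the abstract can be made concrete. The sketch below is illustrative only and is not the thesis's code: it contrasts exact GP regression, whose training cost is dominated by the Cholesky factorisation of the full N × N covariance (O(N³)), with a subset-of-regressors-style sparse approximation built on M pseudo-inputs, which only factorises M × M matrices and so costs O(NM²). The kernel, data, pseudo-input locations, and function names are assumptions made for the example.

# Hedged sketch (not the thesis's implementation): exact GP regression vs.
# a pseudo-input style sparse approximation, to illustrate O(N^3) vs O(N M^2).
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d2 = (A[:, None] - B[None, :]) ** 2
    return signal_var * np.exp(-0.5 * d2 / lengthscale ** 2)

def exact_gp_predict(X, y, Xstar, noise_var=0.1):
    """Exact GP regression: factorise the full N x N covariance (O(N^3))."""
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))      # N x N
    L = np.linalg.cholesky(K)                               # cubic-cost step
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return rbf_kernel(Xstar, X) @ alpha                     # predictive mean

def sparse_gp_predict(X, y, Xstar, Xm, noise_var=0.1, jitter=1e-6):
    """Subset-of-regressors style sparse GP with M pseudo-inputs Xm.

    Only M x M systems are solved, so training cost scales as O(N M^2).
    (The thesis's SPGP/FITC additionally corrects the diagonal; omitted here.)
    """
    Kmm = rbf_kernel(Xm, Xm) + jitter * np.eye(len(Xm))     # M x M
    Knm = rbf_kernel(X, Xm)                                 # N x M
    A = noise_var * Kmm + Knm.T @ Knm                       # M x M system
    mu_m = np.linalg.solve(A, Knm.T @ y)
    return rbf_kernel(Xstar, Xm) @ mu_m                     # predictive mean

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.sort(rng.uniform(-3, 3, 200))                    # N = 200
    y = np.sin(X) + 0.1 * rng.standard_normal(200)
    Xstar = np.linspace(-3, 3, 5)
    Xm = np.linspace(-3, 3, 15)                             # M = 15 pseudo-inputs
    print(exact_gp_predict(X, y, Xstar))
    print(sparse_gp_predict(X, y, Xstar, Xm))

In the thesis's approach the pseudo-input locations are not fixed on a grid as above; as the abstract states, they are optimised together with the hyperparameters at training time.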

Bibliographic Details
Main Author: Snelson, Edward Lloyd
Published: University College London (University of London) 2007
Subjects:
Online Access:http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.445905
id ndltd-bl.uk-oai-ethos.bl.uk-445905
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-445905 2017-10-04T03:13:29Z
Flexible and efficient Gaussian process models for machine learning
Snelson, Edward Lloyd
2007
Gaussian process (GP) models are widely used to perform Bayesian nonlinear regression and classification tasks that are central to many machine learning problems. A GP is nonparametric, meaning that the complexity of the model grows as more data points are received. Another attractive feature is the behaviour of the error bars: they naturally grow in regions away from the training data, where we have high uncertainty about the interpolating function. In their standard form GPs have several limitations, which can be divided into two broad categories: computational difficulties for large data sets, and restrictive modelling assumptions for complex data sets. This thesis addresses various aspects of both of these problems. Training a GP has O(N³) complexity, where N is the number of training data points, because the N × N covariance matrix must be inverted. In this thesis we develop several new techniques to reduce this complexity to O(NM²), where M is a user-chosen number much smaller than N. The sparse approximation we use is based on a set of M 'pseudo-inputs', which are optimised together with the hyperparameters at training time. We develop a further approximation, based on clustering inputs, that can be seen as a mixture of local and global approximations. Standard GPs assume a uniform noise variance; we use the sparse approximation described above as a way of relaxing this assumption, since by modifying the sparse covariance function we can model input-dependent noise. To handle high-dimensional data sets we use supervised linear dimensionality reduction. As another extension of the standard GP, we relax the Gaussianity assumption of the process by learning a nonlinear transformation of the output space. All these techniques further increase the applicability of GPs to real, complex data sets. We present empirical comparisons of our algorithms with various competing techniques, and suggest problem-dependent strategies to follow in practice.
006.31
University College London (University of London)
http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.445905
http://discovery.ucl.ac.uk/1445855/
Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 006.31
spellingShingle 006.31
Snelson, Edward Lloyd
Flexible and efficient Gaussian process models for machine learning
description Gaussian process (GP) models are widely used to perform Bayesian nonlinear regression and classification tasks that are central to many machine learning problems. A GP is nonparametric, meaning that the complexity of the model grows as more data points are received. Another attractive feature is the behaviour of the error bars: they naturally grow in regions away from the training data, where we have high uncertainty about the interpolating function. In their standard form GPs have several limitations, which can be divided into two broad categories: computational difficulties for large data sets, and restrictive modelling assumptions for complex data sets. This thesis addresses various aspects of both of these problems. Training a GP has O(N³) complexity, where N is the number of training data points, because the N × N covariance matrix must be inverted. In this thesis we develop several new techniques to reduce this complexity to O(NM²), where M is a user-chosen number much smaller than N. The sparse approximation we use is based on a set of M 'pseudo-inputs', which are optimised together with the hyperparameters at training time. We develop a further approximation, based on clustering inputs, that can be seen as a mixture of local and global approximations. Standard GPs assume a uniform noise variance; we use the sparse approximation described above as a way of relaxing this assumption, since by modifying the sparse covariance function we can model input-dependent noise. To handle high-dimensional data sets we use supervised linear dimensionality reduction. As another extension of the standard GP, we relax the Gaussianity assumption of the process by learning a nonlinear transformation of the output space. All these techniques further increase the applicability of GPs to real, complex data sets. We present empirical comparisons of our algorithms with various competing techniques, and suggest problem-dependent strategies to follow in practice.
author Snelson, Edward Lloyd
author_facet Snelson, Edward Lloyd
author_sort Snelson, Edward Lloyd
title Flexible and efficient Gaussian process models for machine learning
title_short Flexible and efficient Gaussian process models for machine learning
title_full Flexible and efficient Gaussian process models for machine learning
title_fullStr Flexible and efficient Gaussian process models for machine learning
title_full_unstemmed Flexible and efficient Gaussian process models for machine learning
title_sort flexible and efficient gaussian process models for machine learning
publisher University College London (University of London)
publishDate 2007
url http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.445905
work_keys_str_mv AT snelsonedwardlloyd flexibleandefficientgaussianprocessmodelsformachinelearning
_version_ 1718542597491785728
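
The description above mentions modifying the sparse covariance function to model input-dependent noise. As a hedged sketch, in the notation of the published pseudo-input (SPGP/FITC) work rather than anything quoted from the record, the approximate training covariance combines a low-rank term with a corrected diagonal, where K_NM is the covariance between the N training inputs and the M pseudo-inputs:

% Sketch of the FITC/SPGP covariance approximation (symbols are assumptions
% following the published SPGP notation, not text from this record).
\[
  K_{NN} \;\approx\; Q_{NN} + \operatorname{diag}\!\left(K_{NN} - Q_{NN}\right) + \sigma^{2} I,
  \qquad
  Q_{NN} = K_{NM}\, K_{MM}^{-1}\, K_{MN}.
\]

Because only the M × M matrix K_MM must be inverted and the remaining terms are diagonal or N × M, training scales as O(NM²); allowing the noise term to vary with the input, rather than using a constant σ², is one way to relax the uniform-noise assumption in the direction the abstract describes.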