Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition

abstract: Mixture of experts is a machine learning ensemble approach that consists of individual models that are trained to be ``experts'' on subsets of the data, and a gating network that provides weights to output a combination of the expert predictions. Mixture of experts models do not...

Full description

Bibliographic Details
Other Authors:	Dodge, Samuel Fuller (Author)
Format:	Doctoral Thesis
Language:	English
Published:	2018
Subjects:	Artificial intelligence computer vision deep learning human studies visual saliency
Online Access:	http://hdl.handle.net/2286/R.I.50588

id	ndltd-asu.edu-item-50588
record_format	oai_dc
spelling	ndltd-asu.edu-item-505882018-10-02T03:01:14Z Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition abstract: Mixture of experts is a machine learning ensemble approach that consists of individual models that are trained to be ``experts'' on subsets of the data, and a gating network that provides weights to output a combination of the expert predictions. Mixture of experts models do not currently see wide use due to difficulty in training diverse experts and high computational requirements. This work presents modifications of the mixture of experts formulation that use domain knowledge to improve training, and incorporate parameter sharing among experts to reduce computational requirements. First, this work presents an application of mixture of experts models for quality robust visual recognition. First it is shown that human subjects outperform deep neural networks on classification of distorted images, and then propose a model, MixQualNet, that is more robust to distortions. The proposed model consists of ``experts'' that are trained on a particular type of image distortion. The final output of the model is a weighted sum of the expert models, where the weights are determined by a separate gating network. The proposed model also incorporates weight sharing to reduce the number of parameters, as well as increase performance. Second, an application of mixture of experts to predict visual saliency is presented. A computational saliency model attempts to predict where humans will look in an image. In the proposed model, each expert network is trained to predict saliency for a set of closely related images. The final saliency map is computed as a weighted mixture of the expert networks' outputs, with weights determined by a separate gating network. The proposed model achieves better performance than several other visual saliency models and a baseline non-mixture model. Finally, this work introduces a saliency model that is a weighted mixture of models trained for different levels of saliency. Levels of saliency include high saliency, which corresponds to regions where almost all subjects look, and low saliency, which corresponds to regions where some, but not all subjects look. The weighted mixture shows improved performance compared with baseline models because of the diversity of the individual model predictions. Dissertation/Thesis Dodge, Samuel Fuller (Author) Karam, Lina (Advisor) Jayasuriya, Suren (Committee member) Li, Baoxin (Committee member) Turaga, Pavan (Committee member) Arizona State University (Publisher) Artificial intelligence computer vision deep learning human studies visual saliency eng 150 pages Doctoral Dissertation Electrical Engineering 2018 Doctoral Dissertation http://hdl.handle.net/2286/R.I.50588 http://rightsstatements.org/vocab/InC/1.0/ 2018
collection	NDLTD
language	English
format	Doctoral Thesis
sources	NDLTD
topic	Artificial intelligence computer vision deep learning human studies visual saliency
spellingShingle	Artificial intelligence computer vision deep learning human studies visual saliency Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition
description	abstract: Mixture of experts is a machine learning ensemble approach that consists of individual models that are trained to be ``experts'' on subsets of the data, and a gating network that provides weights to output a combination of the expert predictions. Mixture of experts models do not currently see wide use due to difficulty in training diverse experts and high computational requirements. This work presents modifications of the mixture of experts formulation that use domain knowledge to improve training, and incorporate parameter sharing among experts to reduce computational requirements. First, this work presents an application of mixture of experts models for quality robust visual recognition. First it is shown that human subjects outperform deep neural networks on classification of distorted images, and then propose a model, MixQualNet, that is more robust to distortions. The proposed model consists of ``experts'' that are trained on a particular type of image distortion. The final output of the model is a weighted sum of the expert models, where the weights are determined by a separate gating network. The proposed model also incorporates weight sharing to reduce the number of parameters, as well as increase performance. Second, an application of mixture of experts to predict visual saliency is presented. A computational saliency model attempts to predict where humans will look in an image. In the proposed model, each expert network is trained to predict saliency for a set of closely related images. The final saliency map is computed as a weighted mixture of the expert networks' outputs, with weights determined by a separate gating network. The proposed model achieves better performance than several other visual saliency models and a baseline non-mixture model. Finally, this work introduces a saliency model that is a weighted mixture of models trained for different levels of saliency. Levels of saliency include high saliency, which corresponds to regions where almost all subjects look, and low saliency, which corresponds to regions where some, but not all subjects look. The weighted mixture shows improved performance compared with baseline models because of the diversity of the individual model predictions. === Dissertation/Thesis === Doctoral Dissertation Electrical Engineering 2018
author2	Dodge, Samuel Fuller (Author)
author_facet	Dodge, Samuel Fuller (Author)
title	Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition
title_short	Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition
title_full	Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition
title_fullStr	Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition
title_full_unstemmed	Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition
title_sort	tree-based deep mixture of experts with applications to visual saliency prediction and quality robust visual recognition
publishDate	2018
url	http://hdl.handle.net/2286/R.I.50588
_version_	1718757045073608704

Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition

Similar Items