Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition

abstract: Mixture of experts is a machine learning ensemble approach that consists of individual models that are trained to be ``experts'' on subsets of the data, and a gating network that provides weights to output a combination of the expert predictions. Mixture of experts models do not...

Full description

Bibliographic Details
Other Authors: Dodge, Samuel Fuller (Author)
Format: Doctoral Thesis
Language:English
Published: 2018
Subjects:
Online Access:http://hdl.handle.net/2286/R.I.50588
id ndltd-asu.edu-item-50588
record_format oai_dc
spelling ndltd-asu.edu-item-505882018-10-02T03:01:14Z Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition abstract: Mixture of experts is a machine learning ensemble approach that consists of individual models that are trained to be ``experts'' on subsets of the data, and a gating network that provides weights to output a combination of the expert predictions. Mixture of experts models do not currently see wide use due to difficulty in training diverse experts and high computational requirements. This work presents modifications of the mixture of experts formulation that use domain knowledge to improve training, and incorporate parameter sharing among experts to reduce computational requirements. First, this work presents an application of mixture of experts models for quality robust visual recognition. First it is shown that human subjects outperform deep neural networks on classification of distorted images, and then propose a model, MixQualNet, that is more robust to distortions. The proposed model consists of ``experts'' that are trained on a particular type of image distortion. The final output of the model is a weighted sum of the expert models, where the weights are determined by a separate gating network. The proposed model also incorporates weight sharing to reduce the number of parameters, as well as increase performance. Second, an application of mixture of experts to predict visual saliency is presented. A computational saliency model attempts to predict where humans will look in an image. In the proposed model, each expert network is trained to predict saliency for a set of closely related images. The final saliency map is computed as a weighted mixture of the expert networks' outputs, with weights determined by a separate gating network. The proposed model achieves better performance than several other visual saliency models and a baseline non-mixture model. Finally, this work introduces a saliency model that is a weighted mixture of models trained for different levels of saliency. Levels of saliency include high saliency, which corresponds to regions where almost all subjects look, and low saliency, which corresponds to regions where some, but not all subjects look. The weighted mixture shows improved performance compared with baseline models because of the diversity of the individual model predictions. Dissertation/Thesis Dodge, Samuel Fuller (Author) Karam, Lina (Advisor) Jayasuriya, Suren (Committee member) Li, Baoxin (Committee member) Turaga, Pavan (Committee member) Arizona State University (Publisher) Artificial intelligence computer vision deep learning human studies visual saliency eng 150 pages Doctoral Dissertation Electrical Engineering 2018 Doctoral Dissertation http://hdl.handle.net/2286/R.I.50588 http://rightsstatements.org/vocab/InC/1.0/ 2018
collection NDLTD
language English
format Doctoral Thesis
sources NDLTD
topic Artificial intelligence
computer vision
deep learning
human studies
visual saliency
spellingShingle Artificial intelligence
computer vision
deep learning
human studies
visual saliency
Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition
description abstract: Mixture of experts is a machine learning ensemble approach that consists of individual models that are trained to be ``experts'' on subsets of the data, and a gating network that provides weights to output a combination of the expert predictions. Mixture of experts models do not currently see wide use due to difficulty in training diverse experts and high computational requirements. This work presents modifications of the mixture of experts formulation that use domain knowledge to improve training, and incorporate parameter sharing among experts to reduce computational requirements. First, this work presents an application of mixture of experts models for quality robust visual recognition. First it is shown that human subjects outperform deep neural networks on classification of distorted images, and then propose a model, MixQualNet, that is more robust to distortions. The proposed model consists of ``experts'' that are trained on a particular type of image distortion. The final output of the model is a weighted sum of the expert models, where the weights are determined by a separate gating network. The proposed model also incorporates weight sharing to reduce the number of parameters, as well as increase performance. Second, an application of mixture of experts to predict visual saliency is presented. A computational saliency model attempts to predict where humans will look in an image. In the proposed model, each expert network is trained to predict saliency for a set of closely related images. The final saliency map is computed as a weighted mixture of the expert networks' outputs, with weights determined by a separate gating network. The proposed model achieves better performance than several other visual saliency models and a baseline non-mixture model. Finally, this work introduces a saliency model that is a weighted mixture of models trained for different levels of saliency. Levels of saliency include high saliency, which corresponds to regions where almost all subjects look, and low saliency, which corresponds to regions where some, but not all subjects look. The weighted mixture shows improved performance compared with baseline models because of the diversity of the individual model predictions. === Dissertation/Thesis === Doctoral Dissertation Electrical Engineering 2018
author2 Dodge, Samuel Fuller (Author)
author_facet Dodge, Samuel Fuller (Author)
title Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition
title_short Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition
title_full Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition
title_fullStr Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition
title_full_unstemmed Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition
title_sort tree-based deep mixture of experts with applications to visual saliency prediction and quality robust visual recognition
publishDate 2018
url http://hdl.handle.net/2286/R.I.50588
_version_ 1718757045073608704