Exploring Latent Structure in Data: Algorithms and Implementations

abstract: Feature representation for raw data is one of the most important components of a machine learning system. Traditionally, features are hand-crafted by domain experts, which is often a time-consuming process. Furthermore, such features do not generalize well to unseen data and novel tasks...

Bibliographic Details
Other Authors: Sattigeri, Prasanna (Author)
Format: Doctoral Thesis
Language: English
Published: 2014
Subjects:
GPU
Online Access: http://hdl.handle.net/2286/R.I.27464
id ndltd-asu.edu-item-27464
record_format oai_dc
spelling ndltd-asu.edu-item-27464 2018-06-22T03:05:43Z
Exploring Latent Structure in Data: Algorithms and Implementations
Dissertation/Thesis
Sattigeri, Prasanna (Author)
Spanias, Andreas (Advisor)
Thornton, Trevor (Committee member)
Goryll, Michael (Committee member)
Tsakalis, Konstantinos (Committee member)
Arizona State University (Publisher)
Electrical engineering
Computer engineering
Artificial intelligence
Feature Learning
GPU
Machine Learning
Retrieval
Sparse Coding
eng
147 pages
Doctoral Dissertation
Electrical Engineering
2014
Doctoral Dissertation
http://hdl.handle.net/2286/R.I.27464
http://rightsstatements.org/vocab/InC/1.0/
All Rights Reserved
2014
collection NDLTD
language English
format Doctoral Thesis
sources NDLTD
topic Electrical engineering
Computer engineering
Artificial intelligence
Feature Learning
GPU
Machine Learning
Retrieval
Sparse Coding
description abstract: Feature representation for raw data is one of the most important components of a machine learning system. Traditionally, features are hand-crafted by domain experts, which is often a time-consuming process. Furthermore, such features do not generalize well to unseen data and novel tasks. Recently, there have been many efforts to generate data-driven representations using clustering and sparse models. This dissertation focuses on building data-driven unsupervised models for analyzing raw data and developing efficient feature representations. Simultaneous segmentation and feature extraction approaches for silicon-pore sensor data are considered. To solve this problem, the data are aggregated into a matrix, and low-rank and sparse matrix decompositions with additional smoothness constraints are proposed. A comparison of several variants of these approaches is presented, along with results for signal de-noising and translocation/trapping event extraction. Algorithms based on matrix completion are presented to improve transform-domain features for ion-channel time-series signals. The improved features achieve better classification performance and lower false alarm rates when applied to analyte detection. Developing representations for multimedia is an important and challenging problem, with applications ranging from scene recognition, multimedia retrieval, and personal life-logging systems to field robot navigation. In this dissertation, we present a new framework for feature extraction from challenging natural environment sounds. The proposed features outperform traditional spectral features on challenging environmental sound datasets. Several algorithms are proposed that perform supervised tasks such as recognition and tag annotation, and ensemble methods are proposed to improve the tag annotation process. To facilitate the use of large datasets, fast implementations are developed for sparse coding, the key component in our algorithms. Several strategies are proposed to speed up the Orthogonal Matching Pursuit algorithm using CUDA kernels on a GPU. Implementations are also developed for a large-scale image retrieval system, in which image-based "exact search" and "visually similar search" are performed using the sparse codes of image patches. Results demonstrate a large speed-up over CPU implementations together with good retrieval performance.
Dissertation/Thesis: Doctoral Dissertation, Electrical Engineering, 2014
author2 Sattigeri, Prasanna (Author)
author_facet Sattigeri, Prasanna (Author)
title Exploring Latent Structure in Data: Algorithms and Implementations
publishDate 2014
url http://hdl.handle.net/2286/R.I.27464
_version_ 1718700614359187456