Mixture Modeling, Sparse Covariance Estimation and Parallel Computing in Bayesian Analysis

Bibliographic Details
Main Author: Cron, Andrew
Other Authors: West, Mike
Published: 2012
Online Access: http://hdl.handle.net/10161/6151
Description
Summary:<p>Mixture modeling of continuous data is an extremely effective and popular method for density estimation and clustering. However, as the size of the data grows, both in dimension and in number of observations, many modeling and computational problems arise. In the Bayesian setting, computational methods for posterior inference become intractable as the number of observations and/or possible clusters gets large. Furthermore, relabeling in sampling methods becomes increasingly difficult to address as the data set grows. This thesis addresses computational and methodological solutions to these problems by utilizing modern computational hardware and new methodology. Novel approaches for parsimonious covariance modeling and information sharing across multiple data sets are then built upon these computational improvements.</p><p>Chapter 1 introduces the fundamental approaches in mixture modeling, including Dirichlet processes and posterior inference using Gibbs sampling. Chapter 2 describes the use of graphics processing units for massive gains in computational performance in both mixture models and general Bayesian modeling. Chapter 3 introduces a new relabeling approach in mixture modeling that can be scaled far beyond current methodology to massive data and high-dimensional settings. Chapter 4 generalizes Chapters 2 and 3 to the hierarchical Dirichlet process setting to "borrow strength" from multiple studies in classification problems in flow cytometry. Chapter 5 develops a novel approach for sparse covariance estimation using sparse, full-rank, orthogonal matrix estimation. These new methods are applied to classification in a mixture model with measurement error. Finally, Chapter 6 summarizes the work in this thesis and outlines exciting areas for future research.</p> === Dissertation