On-line EM and quasi-Bayes or: how I learned to stop worrying and love stochastic approximation

Bibliographic Details
Main Author: Bao, Kejie
Language: English
Published: 2009
Online Access: http://hdl.handle.net/2429/14569
Description
Summary: The EM algorithm is one of the most popular statistical learning algorithms. Unfortunately, it is a batch learning method. For large data sets and real-time systems, we need to develop on-line methods. In this thesis, we present a comprehensive study of on-line EM algorithms. We use Bayesian theory to propose a new on-line EM algorithm for multinomial mixtures. Based on this theory, we show that there is a direct connection between the setting of Bayes priors and the so-called learning rates of stochastic approximation algorithms, such as on-line EM and quasi-Bayes. Finally, we present extensive simulations, comparisons and parameter sensitivity studies on both synthetic data and documents with text, images and music.