Bayesian Alignment Model for Analysis of LC-MS-based Omic Data

Liquid chromatography coupled with mass spectrometry (LC-MS) has been widely used in various omic studies for biomarker discovery. Appropriate LC-MS data preprocessing steps are needed to detect true differences between biological groups. Retention time alignment is one of the most important yet cha...

Bibliographic Details
Main Author: Tsai, Tsung-Heng
Other Authors: Electrical and Computer Engineering
Format: Others
Published: Virginia Tech 2015
Subjects:
Online Access: http://hdl.handle.net/10919/64151
id ndltd-VTETD-oai-vtechworks.lib.vt.edu-10919-64151
record_format oai_dc
spelling ndltd-VTETD-oai-vtechworks.lib.vt.edu-10919-64151; 2020-09-29T05:34:49Z; Bayesian Alignment Model for Analysis of LC-MS-based Omic Data; Tsai, Tsung-Heng; Electrical and Computer Engineering; Wang, Yue J.; Yu, Guoqiang; Mun, Seong Ki; Silva, Luiz A.; Ressom, Habtom W.; Xuan, Jianhua; alignment; Bayesian inference; biomarker discovery; liquid chromatography-mass spectrometry (LC-MS); Markov chain Monte Carlo (MCMC); Ph. D.; 2015-11-14T07:00:40Z; 2015-11-14T07:00:40Z; 2014-05-22; Dissertation; vt_gsexam:2790; http://hdl.handle.net/10919/64151; In Copyright; http://rightsstatements.org/vocab/InC/1.0/; ETD; application/pdf; application/pdf; Virginia Tech
collection NDLTD
format Others
sources NDLTD
topic alignment
Bayesian inference
biomarker discovery
liquid chromatography-mass spectrometry (LC-MS)
Markov chain Monte Carlo (MCMC)
description Liquid chromatography coupled with mass spectrometry (LC-MS) has been widely used in various omic studies for biomarker discovery. Appropriate LC-MS data preprocessing steps are needed to detect true differences between biological groups. Retention time alignment is one of the most important yet challenging preprocessing steps; it ensures that ion intensity measurements among multiple LC-MS runs are comparable. In this dissertation, we propose a Bayesian alignment model (BAM) for analysis of LC-MS data. BAM uses Markov chain Monte Carlo (MCMC) methods to draw inference on the model parameters and provides estimates of the retention time variability along with uncertainty measures, offering a natural framework for integrating information from various sources. From methodology development to practical application, we investigate the alignment problem through three research topics: 1) development of a single-profile Bayesian alignment model, 2) development of a multi-profile Bayesian alignment model, and 3) application to biomarker discovery research. Chapter 2 introduces profile-based Bayesian alignment using a single chromatogram, e.g., the base peak chromatogram, from each LC-MS run. The single-profile alignment model improves on existing MCMC-based alignment methods through 1) the implementation of an efficient MCMC sampler using a block Metropolis-Hastings algorithm, and 2) an adaptive mechanism for knot specification using stochastic search variable selection (SSVS). Chapter 3 extends the model to integrate complementary information that better captures the variability in chromatographic separation. We use Gaussian process regression on the internal standards to derive a prior distribution for the mapping functions. In addition, a clustering approach is proposed to identify multiple representative chromatograms for each LC-MS run. With the Gaussian process prior, these chromatograms are considered simultaneously in the profile-based alignment, which greatly improves the model estimation and facilitates the subsequent peak matching process. Chapter 4 demonstrates the applicability of the proposed Bayesian alignment model to biomarker discovery research. We integrate the proposed Bayesian alignment model into a rigorous preprocessing pipeline for LC-MS data analysis. Through the developed analysis pipeline, candidate biomarkers for hepatocellular carcinoma (HCC) are identified and confirmed on a complementary platform. Degree: Ph. D.
author2 Electrical and Computer Engineering
author_facet Electrical and Computer Engineering
Tsai, Tsung-Heng
author Tsai, Tsung-Heng
author_sort Tsai, Tsung-Heng
title Bayesian Alignment Model for Analysis of LC-MS-based Omic Data
title_sort bayesian alignment model for analysis of lc-ms-based omic data
publisher Virginia Tech
publishDate 2015
url http://hdl.handle.net/10919/64151
work_keys_str_mv AT tsaitsungheng bayesianalignmentmodelforanalysisoflcmsbasedomicdata
_version_ 1719343795445694464