Zero-Inflated gaussian mixed models for analyzing longitudinal microbiome data.

Motivation: The human microbiome is variable and dynamic in nature. Longitudinal studies could explain the mechanisms in maintaining the microbiome in health or causing dysbiosis in disease. However, it remains challenging to properly analyze the longitudinal microbiome data from...

Bibliographic Details
Main Authors: Xinyan Zhang, Boyi Guo, Nengjun Yi
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2020-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0242073
id doaj-9c861cbca72742f5a435963839b5a81e
record_format Article
spelling doaj-9c861cbca72742f5a435963839b5a81e | 2021-03-04T12:28:41Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2020-01-01 | 15 | 11 | e0242073 | 10.1371/journal.pone.0242073 | Zero-Inflated gaussian mixed models for analyzing longitudinal microbiome data. | Xinyan Zhang; Boyi Guo; Nengjun Yi | https://doi.org/10.1371/journal.pone.0242073
collection DOAJ
sources DOAJ
issn 1932-6203
description Motivation: The human microbiome is variable and dynamic in nature. Longitudinal studies could explain the mechanisms in maintaining the microbiome in health or causing dysbiosis in disease. However, it remains challenging to properly analyze the longitudinal microbiome data from either 16S rRNA or metagenome shotgun sequencing studies, output as proportions or counts. Most microbiome data are sparse, requiring statistical models to handle zero-inflation. Moreover, longitudinal design induces correlation among the samples and thus further complicates the analysis and interpretation of the microbiome data. Results: In this article, we propose zero-inflated Gaussian mixed models (ZIGMMs) to analyze longitudinal microbiome data. ZIGMMs is a robust and flexible method which can be applicable for longitudinal microbiome proportion data or count data generated with either 16S rRNA or shotgun sequencing technologies. It can include various types of fixed effects and random effects and account for various within-subject correlation structures, and can effectively handle zero-inflation. We developed an efficient Expectation-Maximization (EM) algorithm to fit the ZIGMMs by taking advantage of the standard procedure for fitting linear mixed models. We demonstrate the computational efficiency of our EM algorithm by comparing with two other zero-inflated methods. We show that ZIGMMs outperform the previously used linear mixed models (LMMs), negative binomial mixed models (NBMMs) and zero-inflated Beta regression mixed model (ZIBR) in detecting associated effects in longitudinal microbiome data through extensive simulations. We also apply our method to two public longitudinal microbiome datasets and compare with LMMs and NBMMs in detecting dynamic effects of associated taxa.