Statistical Inference in Missing Data by MCMC and Non-MCMC Multiple Imputation Algorithms: Assessing the Effects of Between-Imputation Iterations

Incomplete data are ubiquitous in social sciences; as a consequence, available data are inefficient and often biased. In the literature, multiple imputation is known to be the standard method to handle missing data. While the theory of multiple imputation has been known for decades, th...


Bibliographic Details
Main Author: Masayoshi Takahashi
Format: Article
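Language: English
Published: Ubiquity Press 2017-07-01
Series: Data Science Journal
Subjects: MCMC; Markov chain Monte Carlo; Incomplete data; Nonresponse; Joint modeling; Conditional modeling
Online Access: https://datascience.codata.org/articles/690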
id doaj-dc705c3de8d943edaf46145f43cb0390
record_format Article
spelling doaj-dc705c3de8d943edaf46145f43cb0390 | 2020-11-24T23:14:58Z | eng | Ubiquity Press | Data Science Journal | 1683-1470 | 2017-07-01 | 16 | 10.5334/dsj-2017-037 | 643 | Statistical Inference in Missing Data by MCMC and Non-MCMC Multiple Imputation Algorithms: Assessing the Effects of Between-Imputation Iterations | Masayoshi Takahashi (IR Office, Tokyo University of Foreign Studies, Tokyo) | https://datascience.codata.org/articles/690 | MCMC | Markov chain Monte Carlo | Incomplete data | Nonresponse | Joint modeling | Conditional modeling
collection DOAJ
language English
format Article
sources DOAJ
author Masayoshi Takahashi
publisher Ubiquity Press
series Data Science Journal
issn 1683-1470
publishDate 2017-07-01
description Incomplete data are ubiquitous in social sciences; as a consequence, available data are inefficient and often biased. In the literature, multiple imputation is known to be the standard method to handle missing data. While the theory of multiple imputation has been known for decades, the implementation is difficult due to the complicated nature of random draws from the posterior distribution. Thus, there are several computational algorithms in software: Data Augmentation (DA), Fully Conditional Specification (FCS), and Expectation-Maximization with Bootstrapping (EMB). Although the literature is full of comparisons between joint modeling (DA, EMB) and conditional modeling (FCS), little is known about the relative performance of the MCMC algorithms (DA, FCS) and the non-MCMC algorithm (EMB), where MCMC stands for Markov chain Monte Carlo. Based on simulation experiments, the current study contends that EMB is a confidence proper multiple imputation algorithm without between-imputation iterations; thus, EMB is more user-friendly than DA and FCS.
topic MCMC
Markov chain Monte Carlo
Incomplete data
Nonresponse
Joint modeling
Conditional modeling
url https://datascience.codata.org/articles/690