Multiple imputation by predictive mean matching in cluster-randomized trials

Abstract Background Random effects regression imputation has been recommended for multiple imputation (MI) in cluster randomized trials (CRTs) because it is congenial to analyses that use random effects regression. This method relies heavily on model assumptions and may not be robust to misspecifica...


Bibliographic Details
Main Authors: Brittney E. Bailey, Rebecca Andridge, Abigail B. Shoben
Format: Article
Language: English
Published: BMC 2020-03-01
Series: BMC Medical Research Methodology
Subjects: Missing data; Cluster-randomized trial; Predictive mean matching; Multiple imputation
Online Access: http://link.springer.com/article/10.1186/s12874-020-00948-6
id doaj-8ee224f1e78247c5895181dd7c6cfe32
record_format Article
spelling doaj-8ee224f1e78247c5895181dd7c6cfe32 | 2020-11-25T02:06:17Z | eng | BMC | BMC Medical Research Methodology | 1471-2288 | 2020-03-01 | 20(1):1-16 | 10.1186/s12874-020-00948-6
Multiple imputation by predictive mean matching in cluster-randomized trials
Brittney E. Bailey (Department of Mathematics and Statistics, Amherst College)
Rebecca Andridge (College of Public Health, The Ohio State University)
Abigail B. Shoben (College of Public Health, The Ohio State University)
http://link.springer.com/article/10.1186/s12874-020-00948-6
Missing data; Cluster-randomized trial; Predictive mean matching; Multiple imputation
collection DOAJ
language English
format Article
sources DOAJ
author Brittney E. Bailey
Rebecca Andridge
Abigail B. Shoben
title Multiple imputation by predictive mean matching in cluster-randomized trials
publisher BMC
series BMC Medical Research Methodology
issn 1471-2288
publishDate 2020-03-01
description Abstract
Background: Random effects regression imputation has been recommended for multiple imputation (MI) in cluster randomized trials (CRTs) because it is congenial to analyses that use random effects regression. This method relies heavily on model assumptions and may not be robust to misspecification of the imputation model. MI by predictive mean matching (PMM) is a semiparametric alternative, but current software for multilevel data relies on imputation models that ignore clustering or use fixed effects for clusters. When used directly for imputation, these two models result in underestimation (ignoring clustering) or overestimation (fixed effects for clusters) of variance estimates.
Methods: We develop MI procedures based on PMM that leverage these opposing estimated biases in the variance estimates in one of three ways: weighting the distance metric (PMM-dist), weighting the average of the final imputed values from two PMM procedures (PMM-avg), or performing a weighted draw from the final imputed values from the two PMM procedures (PMM-draw). We use Monte-Carlo simulations to evaluate our newly proposed methods relative to established MI procedures, focusing on estimation of treatment group means and their variances after MI.
Results: The proposed PMM procedures reduce the bias in the MI variance estimator relative to established methods when the imputation model is correctly specified, and are generally more robust to model misspecification than even the random effects imputation methods.
Conclusions: The PMM-draw procedure in particular is a promising method for multiply imputing missing data from CRTs that can be readily implemented in existing statistical software.
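The weighted-draw idea in the Methods summary can be illustrated with a small sketch. This is not the authors' implementation: it is a simplified, single-imputation illustration that assumes one covariate, a hypothetical mixing weight `w`, a hypothetical donor-pool size `k`, and at least one observed outcome in every cluster. It also skips the posterior draw of regression parameters that a full MI procedure would include. All function names are made up for this example.

```python
import random
from statistics import mean

def fit_pooled(x, y):
    """Simple linear regression y ~ x that ignores clustering."""
    xb, yb = mean(x), mean(y)
    sxx = sum((xi - xb) ** 2 for xi in x)
    slope = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y)) / sxx
    return lambda xi, c: yb + slope * (xi - xb)

def fit_fixed(x, y, cl):
    """y ~ x with a separate intercept per cluster (within estimator)."""
    ids = set(cl)
    xm = {c: mean([xi for xi, ci in zip(x, cl) if ci == c]) for c in ids}
    ym = {c: mean([yi for yi, ci in zip(y, cl) if ci == c]) for c in ids}
    sxx = sum((xi - xm[c]) ** 2 for xi, c in zip(x, cl))
    sxy = sum((xi - xm[c]) * (yi - ym[c]) for xi, yi, c in zip(x, y, cl))
    slope = sxy / sxx
    return lambda xi, c: ym[c] + slope * (xi - xm[c])

def pmm_donor(pred, xi, ci, ox, oc, oy, k, rng):
    """Observed y from a random one of the k donors closest in predicted mean."""
    target = pred(xi, ci)
    ranked = sorted(range(len(oy)),
                    key=lambda j: abs(pred(ox[j], oc[j]) - target))
    return oy[rng.choice(ranked[:k])]

def pmm_draw(x, y, cl, w=0.5, k=3, seed=0):
    """Impute None entries of y by a weighted draw between two PMM results.

    w is the (assumed) probability of taking the value matched under the
    fixed-effects model; otherwise the pooled-model match is used.
    """
    rng = random.Random(seed)
    obs = [i for i, yi in enumerate(y) if yi is not None]
    ox = [x[i] for i in obs]
    oy = [y[i] for i in obs]
    oc = [cl[i] for i in obs]
    pooled = fit_pooled(ox, oy)
    fixed = fit_fixed(ox, oy, oc)
    out = list(y)
    for i, yi in enumerate(y):
        if yi is None:
            y_pool = pmm_donor(pooled, x[i], cl[i], ox, oc, oy, k, rng)
            y_fix = pmm_donor(fixed, x[i], cl[i], ox, oc, oy, k, rng)
            out[i] = y_fix if rng.random() < w else y_pool
    return out

# Toy data: two clusters, one missing outcome in each.
x = [1.0, 2.0, 3.0, 4.0, 1.5, 2.5, 3.5, 4.5]
cl = ["a", "a", "a", "a", "b", "b", "b", "b"]
y = [1.1, 2.0, None, 4.2, 2.4, None, 4.6, 5.5]
completed = pmm_draw(x, y, cl, w=0.5, k=2, seed=1)
```

Because both candidates are observed donor values, every imputed entry is a value that actually occurs in the data. Replacing the final draw with `w * y_fix + (1 - w) * y_pool` would give an averaging variant in the spirit of PMM-avg, again only as an illustration.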
topic Missing data
Cluster-randomized trial
Predictive mean matching
Multiple imputation
url http://link.springer.com/article/10.1186/s12874-020-00948-6