LASSO type penalized spline regression for binary data

Abstract. Background: Generalized linear mixed models (GLMMs), typically used for analyzing correlated data, can also be used for smoothing by considering the knot coefficients from a regression spline as random effects. The resulting models are called semiparametric mixed models (SPMMs). Allowing the...

Bibliographic Details
Main Authors: Muhammad Abu Shadeque Mullah, James A. Hanley, Andrea Benedetti
Format: Article
Language: English
Published: BMC 2021-04-01
Series: BMC Medical Research Methodology
Subjects:
Online Access: https://doi.org/10.1186/s12874-021-01234-9
id doaj-449b938f9b4e41cba21c0351ed624088
record_format Article
spelling doaj-449b938f9b4e41cba21c0351ed624088; 2021-04-25T11:03:06Z; eng; BMC; BMC Medical Research Methodology; 1471-2288; 2021-04-01; 211114; 10.1186/s12874-021-01234-9; LASSO type penalized spline regression for binary data; Muhammad Abu Shadeque Mullah (0); James A. Hanley (1); Andrea Benedetti (2); Department of Epidemiology, Biostatistics and Occupational Health, McGill University; Department of Epidemiology, Biostatistics and Occupational Health, McGill University; Department of Epidemiology, Biostatistics and Occupational Health, McGill University; Abstract. Background: Generalized linear mixed models (GLMMs), typically used for analyzing correlated data, can also be used for smoothing by considering the knot coefficients from a regression spline as random effects. The resulting models are called semiparametric mixed models (SPMMs). Allowing the random knot coefficients to follow a normal distribution with mean zero and a constant variance is equivalent to using a penalized spline with a ridge regression type penalty. We introduce the least absolute shrinkage and selection operator (LASSO) type penalty in the SPMM setting by considering the coefficients at the knots to follow a Laplace double exponential distribution with mean zero. Methods: We adopt a Bayesian approach and use the Markov Chain Monte Carlo (MCMC) algorithm for model fitting. Through simulations, we compare the performance of curve fitting in a SPMM using a LASSO type penalty to that of using ridge penalty for binary data. We apply the proposed method to obtain smooth curves from data on the relationship between the amount of pack years of smoking and the risk of developing chronic obstructive pulmonary disease (COPD). Results: The LASSO penalty performs as well as ridge penalty for simple shapes of association and outperforms the ridge penalty when the shape of association is complex or linear. Conclusion: We demonstrated that LASSO penalty captured complex dose-response association better than the Ridge penalty in a SPMM.; https://doi.org/10.1186/s12874-021-01234-9; Penalized splines; Generalized linear mixed models; Ridge regression; Least absolute shrinkage and selection operator (LASSO); Markov chain Monte Carlo
collection DOAJ
language English
format Article
sources DOAJ
author Muhammad Abu Shadeque Mullah
James A. Hanley
Andrea Benedetti
spellingShingle Muhammad Abu Shadeque Mullah
James A. Hanley
Andrea Benedetti
LASSO type penalized spline regression for binary data
BMC Medical Research Methodology
Penalized splines
Generalized linear mixed models
Ridge regression
Least absolute shrinkage and selection operator (LASSO)
Markov chain Monte Carlo
author_facet Muhammad Abu Shadeque Mullah
James A. Hanley
Andrea Benedetti
author_sort Muhammad Abu Shadeque Mullah
title LASSO type penalized spline regression for binary data
title_short LASSO type penalized spline regression for binary data
title_full LASSO type penalized spline regression for binary data
title_fullStr LASSO type penalized spline regression for binary data
title_full_unstemmed LASSO type penalized spline regression for binary data
title_sort lasso type penalized spline regression for binary data
publisher BMC
series BMC Medical Research Methodology
issn 1471-2288
publishDate 2021-04-01
description Abstract. Background: Generalized linear mixed models (GLMMs), typically used for analyzing correlated data, can also be used for smoothing by considering the knot coefficients from a regression spline as random effects. The resulting models are called semiparametric mixed models (SPMMs). Allowing the random knot coefficients to follow a normal distribution with mean zero and a constant variance is equivalent to using a penalized spline with a ridge regression type penalty. We introduce the least absolute shrinkage and selection operator (LASSO) type penalty in the SPMM setting by considering the coefficients at the knots to follow a Laplace double exponential distribution with mean zero. Methods: We adopt a Bayesian approach and use the Markov Chain Monte Carlo (MCMC) algorithm for model fitting. Through simulations, we compare the performance of curve fitting in a SPMM using a LASSO type penalty to that of using ridge penalty for binary data. We apply the proposed method to obtain smooth curves from data on the relationship between the amount of pack years of smoking and the risk of developing chronic obstructive pulmonary disease (COPD). Results: The LASSO penalty performs as well as ridge penalty for simple shapes of association and outperforms the ridge penalty when the shape of association is complex or linear. Conclusion: We demonstrated that LASSO penalty captured complex dose-response association better than the Ridge penalty in a SPMM.
topic Penalized splines
Generalized linear mixed models
Ridge regression
Least absolute shrinkage and selection operator (LASSO)
Markov chain Monte Carlo
url https://doi.org/10.1186/s12874-021-01234-9
work_keys_str_mv AT muhammadabushadequemullah lassotypepenalizedsplineregressionforbinarydata
AT jamesahanley lassotypepenalizedsplineregressionforbinarydata
AT andreabenedetti lassotypepenalizedsplineregressionforbinarydata
_version_ 1721510098805719040
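
Illustrative note on the abstract: it states that a zero-mean normal distribution on the knot coefficients corresponds to a ridge type penalty, and that a zero-mean Laplace distribution corresponds to a LASSO type penalty. The sketch below is not taken from the article. It only checks numerically that, for independent coefficients, the negative log prior differs from the matching penalty by a constant that does not depend on the coefficients. All function names, the example coefficient vectors, and the scale values sigma and b are hypothetical choices made for illustration.

```python
import math


def neg_log_normal_prior(u, sigma):
    """Negative log density of independent N(0, sigma^2) coefficients."""
    const = len(u) * math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(x * x for x in u) / (2.0 * sigma ** 2) + const


def neg_log_laplace_prior(u, b):
    """Negative log density of independent Laplace(0, b) coefficients."""
    const = len(u) * math.log(2.0 * b)
    return sum(abs(x) for x in u) / b + const


def ridge_penalty(u, lam):
    """Ridge type penalty: lam times the sum of squared coefficients."""
    return lam * sum(x * x for x in u)


def lasso_penalty(u, lam):
    """LASSO type penalty: lam times the sum of absolute coefficients."""
    return lam * sum(abs(x) for x in u)


# Hypothetical knot coefficient vectors and prior scales (assumptions).
u1 = [0.5, -1.0, 0.0, 2.0]
u2 = [0.1, 0.3, -0.2, 0.0]
sigma, b = 1.5, 0.8

# Penalty weights implied by the prior scales.
lam_ridge = 1.0 / (2.0 * sigma ** 2)
lam_lasso = 1.0 / b

# The additive constants cancel when two coefficient vectors are compared,
# so the gap in negative log prior equals the gap in penalty.
ridge_gap = neg_log_normal_prior(u1, sigma) - neg_log_normal_prior(u2, sigma)
lasso_gap = neg_log_laplace_prior(u1, b) - neg_log_laplace_prior(u2, b)

print(math.isclose(ridge_gap, ridge_penalty(u1, lam_ridge) - ridge_penalty(u2, lam_ridge)))
print(math.isclose(lasso_gap, lasso_penalty(u1, lam_lasso) - lasso_penalty(u2, lam_lasso)))
```

Under these assumptions, a smaller sigma or b implies a larger penalty weight and therefore stronger shrinkage of the knot coefficients. The article itself fits the models by MCMC, which this sketch does not attempt.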