Revealing Facts and Avoiding Biases: A Review of Several Common Problems in Statistical Analyses of Epidemiological Data

This paper reviews common challenges encountered in statistical analyses of epidemiological data for epidemiologists. We focus on the application of linear regression, multivariate logistic regression, and log-linear modeling to epidemiological data. Specific topics include: a) deletion of outliers,...

Bibliographic Details
Main Authors: Lihan Yan, Youming Sun, Michael R Boivin, Paul O Kwon, Yuanzhang Li
Format: Article
Language: English
Published: Frontiers Media S.A. 2016-10-01
Series: Frontiers in Public Health
Subjects: Epidemiology; Principal Component Analysis; regression; review; heteroscedasticity; Odds Ratio
Online Access: http://journal.frontiersin.org/Journal/10.3389/fpubh.2016.00207/full
id doaj-762e8c96ef0949b3b71d45aa94712d68
record_format Article
spelling doaj-762e8c96ef0949b3b71d45aa94712d68; 2020-11-25T00:22:44Z; eng; Frontiers Media S.A.; Frontiers in Public Health; 2296-2565; 2016-10-01; 4; 10.3389/fpubh.2016.00207; 209452; Revealing Facts and Avoiding Biases: A Review of Several Common Problems in Statistical Analyses of Epidemiological Data; Lihan Yan (0), Youming Sun (1), Michael R Boivin (2), Paul O Kwon (3), Yuanzhang Li (4); (0) The Food and Drug Administration, (1) Department of Sociology, the Ohio State University, (2) Preventive Medicine Branch, Walter Reed Army Institute of Research, (3) Preventive Medicine Branch, Walter Reed Army Institute of Research, (4) Preventive Medicine Branch, Walter Reed Army Institute of Research; This paper reviews common challenges encountered in statistical analyses of epidemiological data for epidemiologists. We focus on the application of linear regression, multivariate logistic regression, and log-linear modeling to epidemiological data. Specific topics include: a) deletion of outliers, b) heteroscedasticity in linear regression, c) limitations of principal component analysis in dimension reduction, d) hazard ratio vs. odds ratio in a rate comparison analysis, e) log-linear models with multiple response data, and f) ordinal logistic vs. multinomial logistic models. As a general rule, a thorough examination of a model’s assumptions against both current data and prior research should precede its use in estimating effects.; http://journal.frontiersin.org/Journal/10.3389/fpubh.2016.00207/full; Epidemiology, Principal Component Analysis, regression, review, heteroscedasticity, Odds Ratio
collection DOAJ
language English
format Article
sources DOAJ
author Lihan Yan
Youming Sun
Michael R Boivin
Paul O Kwon
Yuanzhang Li
author_facet Lihan Yan
Youming Sun
Michael R Boivin
Paul O Kwon
Yuanzhang Li
author_sort Lihan Yan
title Revealing Facts and Avoiding Biases: A Review of Several Common Problems in Statistical Analyses of Epidemiological Data
publisher Frontiers Media S.A.
series Frontiers in Public Health
issn 2296-2565
publishDate 2016-10-01
description This paper reviews common challenges encountered in statistical analyses of epidemiological data for epidemiologists. We focus on the application of linear regression, multivariate logistic regression, and log-linear modeling to epidemiological data. Specific topics include: a) deletion of outliers, b) heteroscedasticity in linear regression, c) limitations of principal component analysis in dimension reduction, d) hazard ratio vs. odds ratio in a rate comparison analysis, e) log-linear models with multiple response data, and f) ordinal logistic vs. multinomial logistic models. As a general rule, a thorough examination of a model’s assumptions against both current data and prior research should precede its use in estimating effects.
topic Epidemiology
Principal Component Analysis
regression
review
heteroscedasticity
Odds Ratio
url http://journal.frontiersin.org/Journal/10.3389/fpubh.2016.00207/full
_version_ 1725358520816107520