Statistical learning techniques applied to epidemiology: a simulated case-control comparison study with logistic regression


Bibliographic Details
Main Authors: Land Walker H, Heine John J, Egan Kathleen M
Format: Article
Language: English
Published: BMC 2011-01-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/12/37
Description
Summary:

Background: When investigating covariate interactions and group associations with standard regression analyses, the relationship between the response variable and exposure may be difficult to characterize. When the relationship is nonlinear, linear modeling techniques do not capture the nonlinear information content. Statistical learning (SL) techniques with kernels are capable of addressing nonlinear problems without making parametric assumptions. However, these techniques do not produce findings relevant for epidemiologic interpretations. A simulated case-control study was used to contrast the information embedding characteristics and separation boundaries produced by a specific SL technique with those of logistic regression (LR) modeling, which represents a parametric approach. The SL technique consisted of a kernel mapping in combination with a perceptron neural network. Because the LR model has an important epidemiologic interpretation, the SL method was modified to produce the analogous interpretation and generate odds ratios for comparison.

Results: The SL approach is capable of generating odds ratios for main effects and risk factor interactions that better capture nonlinear relationships between exposure variables and outcome in comparison with LR.

Conclusions: The integration of SL methods in epidemiology may improve both the understanding and interpretation of complex exposure/disease relationships.
ISSN:1471-2105
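
Illustrative note: the summary refers to the epidemiologic interpretation of the LR model, namely that the exponentiated coefficient of an exposure is its odds ratio. The following is a minimal sketch of that interpretation only, not the authors' simulation and not their kernel/perceptron method. The sample size, the true odds ratio of 2, the baseline log-odds, and the helper `fit_logistic` are all assumptions chosen for illustration.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson.

    Hypothetical helper for this sketch; X must already contain an
    intercept column.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))   # predicted probabilities
        w = p * (1.0 - p)                     # variance weights
        grad = X.T @ (y - p)                  # score vector
        hess = X.T @ (X * w[:, None])         # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


# Simulated data (assumed values, not taken from the paper).
rng = np.random.default_rng(0)
n = 2000
exposure = rng.integers(0, 2, size=n)          # binary exposure
true_log_or = np.log(2.0)                      # assumed true odds ratio of 2
logit = -1.0 + true_log_or * exposure          # assumed baseline log-odds
p_case = 1.0 / (1.0 + np.exp(-logit))
case = (rng.random(n) < p_case).astype(float)  # 1 = case, 0 = control

# Odds ratio from the fitted LR coefficient.
X = np.column_stack([np.ones(n), exposure])
beta = fit_logistic(X, case)
or_lr = float(np.exp(beta[1]))

# Odds ratio from the 2x2 exposure-by-outcome table.
a = np.sum((case == 1) & (exposure == 1))  # exposed cases
b = np.sum((case == 0) & (exposure == 1))  # exposed controls
c = np.sum((case == 1) & (exposure == 0))  # unexposed cases
d = np.sum((case == 0) & (exposure == 0))  # unexposed controls
or_table = float((a * d) / (b * c))

print(f"odds ratio from LR coefficient: {or_lr:.3f}")
print(f"odds ratio from 2x2 table:      {or_table:.3f}")
```

With a single binary exposure the two estimates coincide, because the model is saturated. The nonlinear exposure/outcome relationships discussed in the summary are the setting in which a single linear-logit coefficient of this kind is said to fall short.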