Modelling hospital outcome: problems with endogeneity

Abstract Background Mortality modelling in the critical care paradigm traditionally uses logistic regression, despite the availability of estimators commonly used in alternate disciplines. Little attention has been paid to covariate endogeneity and the status of non-randomized treatment assignment....


Bibliographic Details
Main Authors: John L. Moran, John D. Santamaria, Graeme J. Duke, The Australian & New Zealand Intensive Care Society (ANZICS) Centre for Outcomes & Resource Evaluation (CORE)
Format: Article
Language: English
Published: BMC 2021-06-01
Series: BMC Medical Research Methodology
Subjects: Outcome analysis; Logit; Probit; Linear probability model; Calibration; Endogeneity
Online Access: https://doi.org/10.1186/s12874-021-01251-8
id doaj-790f510b7790457fac5230ea5ea18eb7
record_format Article
spelling doaj-790f510b7790457fac5230ea5ea18eb7 2021-06-27T11:03:09Z eng BMC BMC Medical Research Methodology 1471-2288 2021-06-01 211119 10.1186/s12874-021-01251-8 Modelling hospital outcome: problems with endogeneity
John L. Moran (0); John D. Santamaria (1); Graeme J. Duke (2); The Australian & New Zealand Intensive Care Society (ANZICS) Centre for Outcomes & Resource Evaluation (CORE)
(0) Department of Intensive Care Medicine, The Queen Elizabeth Hospital
(1) Department of Critical Care Medicine, St Vincent’s Hospital (Melbourne)
(2) Intensive Services, Eastern Health
https://doi.org/10.1186/s12874-021-01251-8
Outcome analysis; Logit; Probit; Linear probability model; Calibration; Endogeneity
collection DOAJ
language English
format Article
sources DOAJ
author John L. Moran
John D. Santamaria
Graeme J. Duke
The Australian & New Zealand Intensive Care Society (ANZICS) Centre for Outcomes & Resource Evaluation (CORE)
title Modelling hospital outcome: problems with endogeneity
publisher BMC
series BMC Medical Research Methodology
issn 1471-2288
publishDate 2021-06-01
description Abstract
Background: Mortality modelling in the critical care paradigm traditionally uses logistic regression, despite the availability of estimators commonly used in other disciplines. Little attention has been paid to covariate endogeneity and the status of non-randomized treatment assignment. Using a large registry database, various binary outcome modelling strategies and methods to account for covariate endogeneity were explored.
Methods: Patient mortality data were sourced from the Australian & New Zealand Intensive Care Society Adult Patient Database for 2016. Hospital mortality was modelled using logistic, probit and linear probability (LPM) models with intensive care unit (ICU) providers as fixed (FE) and random (RE) effects. Model comparison entailed indices of discrimination and calibration, information criteria (AIC and BIC) and binned residual analysis. Suspected endogeneity of covariates and of ventilation treatment assignment was identified by correlation between the error terms of the predictor variable and of hospital mortality, using the Stata™ “eprobit” estimator. Marginal effects were used to demonstrate differences in effect estimates between probit and “eprobit” models.
Results: The cohort comprised 92,693 patients from 124 ICUs in calendar year 2016. Mean patient age was 61.8 (SD 17.5) years, 41.6% were female and the mean APACHE III severity of illness score was 54.5 (SD 25.6); 43.7% were ventilated. Of the models considered in predicting hospital mortality, logistic regression (with or without ICU FE) and RE logistic regression dominated, the latter more so on information criteria. The LPM suffered from many predictions outside the unit [0,1] interval and from both poor discrimination and poor calibration. Error terms of hospital length of stay, an independent risk of death score and ventilation status were correlated with the mortality error term. Marked, scenario-dependent differences in the ventilation mortality marginal effect were demonstrated between the probit and the “eprobit” models. Endogeneity was not demonstrated for the APACHE III score.
Conclusions: Logistic regression accounting for provider effects was the preferred estimator for hospital mortality modelling. Endogeneity of covariates and treatment variables may be identified using appropriate modelling, but failure to do so yields problematic effect estimates.
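The abstract's remark that the LPM produced predictions outside the unit interval can be illustrated with a minimal sketch. It is not the authors' analysis: the data are simulated rather than drawn from the registry, NumPy stands in for Stata, and every name (`severity`, `died`, `fit_lpm`, `fit_logit`) and coefficient value is a hypothetical choice made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated cohort (hypothetical, NOT the registry data): one standardized
# severity score and a binary death indicator with a low baseline risk.
n = 5000
severity = rng.normal(size=n)
true_logit = -2.5 + 1.5 * severity          # assumed data-generating model
died = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

X = np.column_stack([np.ones(n), severity])  # intercept + covariate


def fit_lpm(X, y):
    """Linear probability model: ordinary least squares on the 0/1 outcome."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def fit_logit(X, y, n_iter=25):
    """Logistic regression fitted by Newton-Raphson on the log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        score = X.T @ (y - p)                               # gradient
        information = X.T @ (X * (p * (1.0 - p))[:, None])  # negative Hessian
        beta = beta + np.linalg.solve(information, score)
    return beta


beta_lpm = fit_lpm(X, died)
beta_logit = fit_logit(X, died)

p_lpm = X @ beta_lpm                                  # can leave [0, 1]
p_logit = 1.0 / (1.0 + np.exp(-(X @ beta_logit)))     # always inside (0, 1)

n_outside = int(np.sum((p_lpm < 0) | (p_lpm > 1)))
print(f"LPM predictions outside [0, 1]: {n_outside} of {n}")
print(f"Logit prediction range: {p_logit.min():.4f} to {p_logit.max():.4f}")
```

With a rare outcome and a strong predictor, the fitted straight line drops below zero for low-risk patients, whereas the logistic link keeps every fitted probability strictly between 0 and 1. The sketch does not address endogeneity, which the abstract handles with Stata's “eprobit” estimator.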
topic Outcome analysis
Logit
Probit
Linear probability model
Calibration
Endogeneity