Reclassification calibration test for censored survival data: performance and comparison to goodness-of-fit criteria
Abstract. Background: The risk reclassification table assesses clinical performance of a biomarker in terms of movements across relevant risk categories. The Reclassification-Calibration (RC) statistic has been developed for binary outcomes, but its performance for survival data with moderate to high...
Main Authors: | Olga V. Demler, Nina P. Paynter, Nancy R. Cook |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2018-07-01 |
Series: | Diagnostic and Prognostic Research |
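Subjects: | Risk reclassification, Calibration, Goodness-of-fit test, Survival analysis, Hosmer-Lemeshow, Grønnesby-Borgan |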
Online Access: | http://link.springer.com/article/10.1186/s41512-018-0034-5 |
id |
doaj-547309e08c30414e9380d40cd9273bea |
---|---|
record_format |
Article |
spelling |
doaj-547309e08c30414e9380d40cd9273bea; 2020-11-24T23:53:30Z; eng; BMC; Diagnostic and Prognostic Research; 2397-7523; 2018-07-01; 21112; 10.1186/s41512-018-0034-5; Reclassification calibration test for censored survival data: performance and comparison to goodness-of-fit criteria; Olga V. Demler [0], Nina P. Paynter [1], Nancy R. Cook [2]; [0] Division of Preventive Medicine, Brigham and Women’s Hospital; [1] Division of Preventive Medicine, Brigham and Women’s Hospital; [2] Division of Preventive Medicine, Brigham and Women’s Hospital; Abstract. Background: The risk reclassification table assesses clinical performance of a biomarker in terms of movements across relevant risk categories. The Reclassification-Calibration (RC) statistic has been developed for binary outcomes, but its performance for survival data with moderate to high censoring rates has not been evaluated. Methods: We develop an RC statistic for survival data with higher censoring rates using the Greenwood-Nam-D’Agostino approach (RC-GND). We examine its performance characteristics and compare its performance and utility to the Hosmer-Lemeshow goodness-of-fit test under various assumptions about the censoring rate and the shape of the baseline hazard. Results: The RC-GND test was robust to high (up to 50%) censoring rates and did not exceed the targeted 5% Type I error in a variety of simulated scenarios. It achieved 80% power to detect better calibration with respect to clinical categories when an important predictor with a hazard ratio of at least 1.7 to 2.2 was added to the model, while the Hosmer-Lemeshow goodness-of-fit (gof) test had power of 5% in this scenario. Conclusions: The RC-GND test should be used to test the improvement in calibration with respect to clinically relevant risk strata. When an important predictor is omitted, the Hosmer-Lemeshow goodness-of-fit test is usually not significant, while the RC-GND test is sensitive to such an omission.; http://link.springer.com/article/10.1186/s41512-018-0034-5; Risk reclassification, Calibration, Goodness-of-fit test, Survival analysis, Hosmer-Lemeshow, Grønnesby-Borgan |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Olga V. Demler, Nina P. Paynter, Nancy R. Cook |
author_facet |
Olga V. Demler, Nina P. Paynter, Nancy R. Cook |
author_sort |
Olga V. Demler |
title |
Reclassification calibration test for censored survival data: performance and comparison to goodness-of-fit criteria |
publisher |
BMC |
series |
Diagnostic and Prognostic Research |
issn |
2397-7523 |
publishDate |
2018-07-01 |
description |
Abstract. Background: The risk reclassification table assesses clinical performance of a biomarker in terms of movements across relevant risk categories. The Reclassification-Calibration (RC) statistic has been developed for binary outcomes, but its performance for survival data with moderate to high censoring rates has not been evaluated. Methods: We develop an RC statistic for survival data with higher censoring rates using the Greenwood-Nam-D’Agostino approach (RC-GND). We examine its performance characteristics and compare its performance and utility to the Hosmer-Lemeshow goodness-of-fit test under various assumptions about the censoring rate and the shape of the baseline hazard. Results: The RC-GND test was robust to high (up to 50%) censoring rates and did not exceed the targeted 5% Type I error in a variety of simulated scenarios. It achieved 80% power to detect better calibration with respect to clinical categories when an important predictor with a hazard ratio of at least 1.7 to 2.2 was added to the model, while the Hosmer-Lemeshow goodness-of-fit (gof) test had power of 5% in this scenario. Conclusions: The RC-GND test should be used to test the improvement in calibration with respect to clinically relevant risk strata. When an important predictor is omitted, the Hosmer-Lemeshow goodness-of-fit test is usually not significant, while the RC-GND test is sensitive to such an omission. |
topic |
Risk reclassification, Calibration, Goodness-of-fit test, Survival analysis, Hosmer-Lemeshow, Grønnesby-Borgan |
url |
http://link.springer.com/article/10.1186/s41512-018-0034-5 |