IPDfromKM: reconstruct individual patient data from published Kaplan-Meier survival curves

Bibliographic Details
Main Authors: Na Liu, Yanhong Zhou, J. Jack Lee
Format: Article
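Language: English
Published: BMC 2021-06-01
Series: BMC Medical Research Methodology
Subjects: Individual patient data (IPD); Kaplan-Meier curve; Meta-analysis; R package; Shiny application; Survival analysis
Online Access: https://doi.org/10.1186/s12874-021-01308-8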
id doaj-aed4e8ec588b41f4b8686379281ea484
record_format Article
spelling doaj-aed4e8ec588b41f4b8686379281ea484
2021-06-06T11:03:07Z
eng
BMC
BMC Medical Research Methodology
1471-2288
2021-06-01
211122
10.1186/s12874-021-01308-8
IPDfromKM: reconstruct individual patient data from published Kaplan-Meier survival curves
Na Liu (Department of Biostatistics, The University of Texas, MD Anderson Cancer Center)
Yanhong Zhou (Department of Biostatistics, The University of Texas, MD Anderson Cancer Center)
J. Jack Lee (Department of Biostatistics, The University of Texas, MD Anderson Cancer Center)
collection DOAJ
language English
format Article
sources DOAJ
author Na Liu
Yanhong Zhou
J. Jack Lee
title IPDfromKM: reconstruct individual patient data from published Kaplan-Meier survival curves
publisher BMC
series BMC Medical Research Methodology
issn 1471-2288
publishDate 2021-06-01
description Abstract. Background: When performing secondary analysis of published survival data, it is critical to obtain each patient’s raw data, because the individual patient data (IPD) approach is considered the gold standard of data analysis. However, researchers often lack access to IPD. We aim to propose a straightforward and robust approach to obtain IPD from published survival curves with a user-friendly software platform. Results: Improving upon existing methods, we propose an easy-to-use, two-stage approach to reconstruct IPD from published Kaplan-Meier (K-M) curves. Stage 1 extracts raw data coordinates, and Stage 2 reconstructs IPD using the proposed method. To facilitate the use of the proposed method, we developed the R package IPDfromKM and an accompanying web-based Shiny application. Both the R package and the Shiny application have an “all-in-one” feature: users can extract raw data coordinates from published K-M curves, reconstruct IPD from the extracted coordinates, visualize the reconstructed IPD, assess the accuracy of the reconstruction, and perform secondary analysis based on the reconstructed IPD. We illustrate the use of the R package and the Shiny application with K-M curves from published studies. Extensive simulations and real-world data applications demonstrate that the proposed method has high accuracy and reliability in estimating the number of events, number of patients at risk, survival probabilities, median survival times, and hazard ratios. Conclusions: IPDfromKM reconstructs IPD from published K-M curves of different shapes with great flexibility and accuracy. We believe that the R package and the Shiny application will greatly facilitate the use of quality IPD and advance the use of secondary data to support informed decision making in medical research.
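For orientation, the survival probabilities drawn in a K-M curve can be computed from IPD with the product-limit estimator; reconstruction works in the opposite direction, recovering event and censoring times that are consistent with coordinates read off a published curve. The sketch below shows only the forward direction. It is not the IPDfromKM algorithm and does not use the R package; the function name `kaplan_meier` and the toy data are assumptions made for illustration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate from individual patient data.

    times  -- follow-up time for each patient
    events -- 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # The curve steps down only at times with at least one event.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Events and censorings at time t both reduce the number at risk.
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: six patients, two of them censored.
toy_times = [2, 3, 3, 5, 8, 8]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
```

The number at risk and the number of events at each step are exactly the quantities the abstract lists as reconstruction targets, which is why they appear explicitly in the loop.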
topic Individual patient data (IPD)
Kaplan-Meier curve
Meta-analysis
R package
Shiny application
Survival analysis
url https://doi.org/10.1186/s12874-021-01308-8