Summary: | Abstract
Background Bladder cancer outcomes have not changed significantly in 30 years. The BladderPath trial (Image Directed Redesign of Bladder Cancer Treatment Pathway, ISRCTN35296862) proposes to evaluate a modified pathway for diagnosis and treatment, so that appropriate pathways are undertaken earlier, with the aim of improving outcomes. Where trial data are traditionally collected on case report forms, we are piloting a novel data collection technique based on routine National Health Service (NHS) data, with no traditional contact between patients and health care professionals after recruitment. Data will be collected from routine administrative sources and validated via data queries to sites. We report here the feasibility, pre-trial methodological development and validation of the schema proposed for BladderPath.
Methods Locally treated patient cohorts were used to validate two routine data sources: hospital interactions data (HID) and administrative radiotherapy department data (RTD). Events of interest from a single site were algorithmically extracted from the 2008–2018 HID and validated against reference datasets to determine detection sensitivity. Survival analysis was performed using the RTD and the HID. Hazard ratios and survival statistics were calculated to estimate treatment effects and to further validate and assess the scope of routine data.
Results Overall, 829/1042 events of interest (sensitivity 0.80) were identified in the HID, with sensitivity varying by event type: 202/206 surgical events (sensitivity 0.98; positive predictive value (PPV) 0.96) were identified, but only 391/568 radiotherapy regimens (sensitivity 0.69; PPV 0.95). Data quality improved over time: detection rose from 41/117 events (35%) in 2011 to 104/109 (95%) in 2017 (all event types). Using the RTD, 5-year survival rates were 43% (95% CI 25–59%) in the chemoradiotherapy group and 30% (95% CI 23–36%) in the radiotherapy group; using the HID, the 5-year survival rate after radical cystectomy was 57% (95% CI 50–63%).
Conclusions Routine data are a feasible source for trial data collection. Provided that events of interest are pre-validated, very high sensitivities for trial conduct can be achieved and further improved with targeted data queries. Routine data can also produce outcomes comparable to those of clinical trials and national datasets. Given the real-time, obligatory nature of the HID, which forms the Hospital Episode Statistics (HES) data, alongside other datasets, we believe routine data extraction and validation is a robust way of rapidly collecting datasets for trials.
|
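
The detection sensitivities quoted in the Results can be reproduced from the reported counts. The following is a minimal sketch, not the authors' extraction or validation code: it assumes sensitivity is simply the number of events detected in the HID divided by the number of events in the reference dataset, and the function name `sensitivity` and the dictionary `reported` are hypothetical names introduced here for illustration. Only counts stated in the abstract are used.

```python
def sensitivity(detected: int, reference: int) -> float:
    """Proportion of reference events that were detected (assumed definition)."""
    if reference <= 0:
        raise ValueError("reference count must be positive")
    if not 0 <= detected <= reference:
        raise ValueError("detected count must lie between 0 and reference")
    return detected / reference


# (detected in HID, present in reference dataset), as reported in the abstract.
reported = {
    "all events": (829, 1042),
    "surgical events": (202, 206),
    "radiotherapy regimens": (391, 568),
    "all events, 2011": (41, 117),
    "all events, 2017": (104, 109),
}

for label, (detected, reference) in reported.items():
    value = sensitivity(detected, reference)
    print(f"{label}: {detected}/{reference} = {value:.2f}")
```

Rounded to two decimal places, these ratios agree with the figures given in the abstract (0.80, 0.98, 0.69, and 35% and 95% for 2011 and 2017). The PPV values cannot be recomputed in the same way because the false positive counts are not reported here.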