Double sampling with multiple imputation to answer large sample meta-research questions: Introduction and illustration by evaluating adherence to two simple CONSORT guidelines


Bibliographic Details
Main Authors: Patrice L. Capers, Andrew W. Brown, John Dawson, David B. Allison
Format: Article
Language: English
Published: Frontiers Media S.A. 2015-03-01
Series: Frontiers in Nutrition
Subjects:
Online Access: http://journal.frontiersin.org/Journal/10.3389/fnut.2015.00006/full
Description
Summary: BACKGROUND: Meta-research can involve manual retrieval and evaluation of research, which is resource intensive. High-throughput methods (e.g., search heuristics, crowdsourcing) have made large meta-research questions more feasible, but possibly at the cost of accuracy.
OBJECTIVE: To evaluate the use of double sampling combined with multiple imputation (DS+MI) to address meta-research questions, using as an example the adherence of PubMed entries to two simple Consolidated Standards of Reporting Trials (CONSORT) guidelines for titles and abstracts.
METHODS: For the DS large sample, we retrieved all PubMed entries satisfying the filters: RCT; human; abstract available; and English language (n = 322,107). For the DS subsample, we randomly sampled 500 entries from the large sample. The large sample was evaluated with a lower-rigor, higher-throughput (RLOTHI) method using search heuristics, while the subsample was evaluated with a higher-rigor, lower-throughput (RHITLO) human rating method. Multiple imputation of the missing-completely-at-random RHITLO data for the large sample was informed by: RHITLO data from the subsample; RLOTHI data from the large sample; whether a study was an RCT; and country and year of publication.
RESULTS: The RHITLO and RLOTHI methods largely agreed in the subsample (phi coefficients: title = 1.00, abstract = 0.92). Compliance with the abstract and title criteria increased over time, with non-US countries improving more rapidly. DS+MI logistic regression estimates were more precise than subsample estimates (e.g., 95% CI for change in title and abstract compliance by Year: subsample RHITLO 1.050-1.174 vs. DS+MI 1.082-1.151). As evidence of improved accuracy, DS+MI coefficient estimates were closer to the RHITLO estimates than the large-sample RLOTHI estimates were.
CONCLUSIONS: Our results support our hypothesis that DS+MI would result in improved precision and accuracy. This method is flexible and may provide a practical way to examine large corpora of literature.
ISSN: 2296-861X
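
The DS+MI workflow described in the summary can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the authors' implementation: the article reports logistic regression estimates and an imputation model that also uses RCT status, country, and year, whereas this sketch imputes a single binary rating from the heuristic rating alone and pools a simple compliance proportion with Rubin's rules. All function names, the simulated data, and the 95% agreement rate are assumptions made for illustration.

```python
import math
import random
import statistics


def phi_coefficient(x, y):
    """Phi coefficient between two equal-length sequences of 0/1 ratings."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom if denom else float("nan")


def simulate(n_large=5000, n_sub=500, seed=1):
    """Hypothetical data: a cheap rating for everyone, a careful one for a subsample."""
    rng = random.Random(seed)
    truth = [int(rng.random() < 0.3) for _ in range(n_large)]
    # Assumed: the heuristic (RLOTHI-like) rating matches the truth 95% of the time.
    rlothi = [t if rng.random() < 0.95 else 1 - t for t in truth]
    # Random subsample, so the unrated entries are missing completely at random.
    rated = set(rng.sample(range(n_large), n_sub))
    rhitlo = [t if i in rated else None for i, t in enumerate(truth)]
    return truth, rlothi, rhitlo


def impute_once(rlothi, rhitlo, rng):
    """Return one completed copy of rhitlo with every None filled in."""
    p = {}
    for k in (0, 1):
        rated = [h for r, h in zip(rlothi, rhitlo) if r == k and h is not None]
        # Draw P(rhitlo = 1 | rlothi = k) from its Beta posterior (Beta(1, 1) prior)
        # so that each imputation reflects uncertainty in the subsample estimate.
        p[k] = rng.betavariate(1 + sum(rated), 1 + len(rated) - sum(rated))
    return [h if h is not None else int(rng.random() < p[r])
            for r, h in zip(rlothi, rhitlo)]


def pool_proportion(rlothi, rhitlo, m=20, seed=0):
    """Pool the compliance proportion over m imputations using Rubin's rules."""
    rng = random.Random(seed)
    n = len(rhitlo)
    estimates, variances = [], []
    for _ in range(m):
        completed = impute_once(rlothi, rhitlo, rng)
        q = sum(completed) / n
        estimates.append(q)
        variances.append(q * (1 - q) / n)
    q_bar = statistics.mean(estimates)
    within = statistics.mean(variances)
    between = statistics.variance(estimates)
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)


if __name__ == "__main__":
    truth, rlothi, rhitlo = simulate()
    pairs = [(r, h) for r, h in zip(rlothi, rhitlo) if h is not None]
    print("phi in subsample:", phi_coefficient(*zip(*pairs)))
    print("pooled estimate, SE:", pool_proportion(rlothi, rhitlo))
```

A fuller analysis along the lines of the summary would replace the per-stratum Beta draw with a regression-based imputation model that includes the additional covariates, and would pool regression coefficients rather than a single proportion.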