"Yes, but will it work for my patients?" Driving clinically relevant research with benchmark datasets

Benchmark datasets have a powerful normative influence: by determining how the real world is represented in data, they define which problems will first be solved by algorithms built using the datasets and, by extension, who these algorithms will work for. It is desirable for these datasets to serve...

Bibliographic Details
Main Authors: Panch, Trishan (Author), Pollard, Tom Joseph (Author), Mattie, Heather (Author), Lindemer, Emily (Author), Keane, Pearse A. (Author), Celi, Leo Anthony G. (Author)
Other Authors: Massachusetts Institute of Technology. Institute for Medical Engineering & Science (Contributor)
Format: Article
Language: English
Published: Springer Science and Business Media LLC, 2020-08-13T22:03:39Z.
Online Access: Get fulltext
LEADER 02495 am a22002293u 4500
001 126577
042 |a dc 
100 1 0 |a Panch, Trishan  |e author 
100 1 0 |a Massachusetts Institute of Technology. Institute for Medical Engineering & Science  |e contributor 
700 1 0 |a Pollard, Tom Joseph  |e author 
700 1 0 |a Mattie, Heather  |e author 
700 1 0 |a Lindemer, Emily  |e author 
700 1 0 |a Keane, Pearse A.  |e author 
700 1 0 |a Celi, Leo Anthony G.  |e author 
245 0 0 |a "Yes, but will it work for my patients?" Driving clinically relevant research with benchmark datasets 
260 |b Springer Science and Business Media LLC,   |c 2020-08-13T22:03:39Z. 
856 |z Get fulltext  |u https://hdl.handle.net/1721.1/126577 
520 |a Benchmark datasets have a powerful normative influence: by determining how the real world is represented in data, they define which problems will first be solved by algorithms built using the datasets and, by extension, who these algorithms will work for. It is desirable for these datasets to serve four functions: (1) enabling the creation of clinically relevant algorithms; (2) facilitating like-for-like comparison of algorithmic performance; (3) ensuring reproducibility of algorithms; (4) asserting a normative influence on the clinical domains and diversity of patients that will potentially benefit from technological advances. Without benchmark datasets that satisfy these functions, it is impossible to address two perennial concerns of clinicians experienced in computational research: "the data scientists just go where the data is rather than where the needs are," and, "yes, but will this work for my patients?" If algorithms are to be developed and applied for the care of patients, then it is prudent for the research community to create benchmark datasets proactively, across specialties. As yet, best practice in this area has not been defined. Broadly speaking, efforts will include design of the dataset; compliance and contracting issues relating to the sharing of sensitive data; enabling access and reuse; and planning for translation of algorithms to the clinical environment. If a deliberate and systematic approach is not followed, not only will the considerable benefits of clinical algorithms fail to be realized, but the potential harms may be regressively incurred across existing gradients of social inequity. 
520 |a National Institutes of Health (Grant R01 EV017205) 
546 |a en 
655 7 |a Article 
773 |t npj Digital Medicine