Bringing Code to Data: Do Not Forget Governance

Developing or independently evaluating algorithms in biomedical research is difficult because of restrictions on access to clinical data. Access is restricted because of privacy concerns, the proprietary treatment of data by institutions (fueled in part by the cost of data hosting, curati...

Bibliographic Details
Main Authors: Suver, Christine; Thorogood, Adrian; Doerr, Megan; Wilbanks, John; Knoppers, Bartha
Format: Article
Language: English
Published: JMIR Publications 2020-07-01
Series: Journal of Medical Internet Research
Online Access: http://www.jmir.org/2020/7/e18087/
id doaj-a9776db9ef434250a21ef021d0772e58
record_format Article
spelling doaj-a9776db9ef434250a21ef021d0772e58 2021-04-02T18:56:36Z eng JMIR Publications Journal of Medical Internet Research 1438-8871 2020-07-01 22 7 e18087 10.2196/18087
collection DOAJ
language English
format Article
sources DOAJ
author Suver, Christine
Thorogood, Adrian
Doerr, Megan
Wilbanks, John
Knoppers, Bartha
title Bringing Code to Data: Do Not Forget Governance
publisher JMIR Publications
series Journal of Medical Internet Research
issn 1438-8871
publishDate 2020-07-01
description Developing or independently evaluating algorithms in biomedical research is difficult because of restrictions on access to clinical data. Access is restricted because of privacy concerns, the proprietary treatment of data by institutions (fueled in part by the cost of data hosting, curation, and distribution), concerns over misuse, and the complexities of applicable regulatory frameworks. The use of cloud technology and services can address many of the barriers to data sharing. For example, researchers can access data in high-performance, secure, and auditable cloud computing environments without the need for copying or downloading. An alternative path to accessing data sets requiring additional protection is the model-to-data approach. In model-to-data, researchers submit algorithms to run on secure data sets that remain hidden. Model-to-data is designed to enhance security and local control while enabling communities of researchers to generate new knowledge from sequestered data. Model-to-data has not yet been widely implemented, but pilots have demonstrated its utility when technical or legal constraints preclude other methods of sharing. We argue that model-to-data can make a valuable addition to our data-sharing arsenal, with 2 caveats. First, model-to-data should only be adopted where necessary to supplement rather than replace existing data-sharing approaches, given that it requires significant resource commitments from data stewards and limits scientific freedom, reproducibility, and scalability. Second, although model-to-data reduces concerns over data privacy and loss of local control when sharing clinical data, it is not an ethical panacea. Data stewards will remain hesitant to adopt model-to-data approaches without guidance on how to do so responsibly. To address this gap, we explored how commitments to open science, reproducibility, security, respect for data subjects, and research ethics oversight must be re-evaluated in a model-to-data context.
url http://www.jmir.org/2020/7/e18087/