A Comprehensive Infrastructure for Big Data in Cancer Research: Accelerating Cancer Research and Precision Medicine

Advancements in next-generation sequencing and other -omics technologies are accelerating the detailed molecular characterization of individual patient tumors, and driving the evolution of precision medicine. Cancer is no longer considered a single disease, but rather, a diverse array of diseases wh...

Bibliographic Details
Main Authors: Izumi V. Hinkson, Tanja M. Davidsen, Juli D. Klemm, Ishwar Chandramouliswaran, Anthony R. Kerlavage, Warren A. Kibbe
Format: Article
Language: English
Published: Frontiers Media S.A. 2017-09-01
Series: Frontiers in Cell and Developmental Biology
Subjects: genomics, proteomics, imaging, big data, cancer, precision medicine
Online Access: http://journal.frontiersin.org/article/10.3389/fcell.2017.00083/full
id doaj-6741549dc2f74eb1aa2e849ca5c0e299
record_format Article
spelling doaj-6741549dc2f74eb1aa2e849ca5c0e299
2020-11-24T22:53:31Z
eng
Frontiers Media S.A.
Frontiers in Cell and Developmental Biology
2296-634X
2017-09-01
5
10.3389/fcell.2017.00083
291496
A Comprehensive Infrastructure for Big Data in Cancer Research: Accelerating Cancer Research and Precision Medicine
Izumi V. Hinkson (0): Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, United States
Izumi V. Hinkson (1): Science and Technology Policy Fellowship Program, American Association for the Advancement of Science, Washington, DC, United States
Tanja M. Davidsen (2): Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, United States
Juli D. Klemm (3): Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, United States
Ishwar Chandramouliswaran (4): Office of Genomics and Advanced Technologies, National Institute of Allergy and Infectious Diseases, Bethesda, MD, United States
Anthony R. Kerlavage (5): Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, United States
Warren A. Kibbe (6): Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, United States
Warren A. Kibbe (7): Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, United States
Advancements in next-generation sequencing and other -omics technologies are accelerating the detailed molecular characterization of individual patient tumors, and driving the evolution of precision medicine. Cancer is no longer considered a single disease, but rather, a diverse array of diseases wherein each patient has a unique collection of germline variants and somatic mutations. Molecular profiling of patient-derived samples has led to a data explosion that could help us understand the contributions of environment and germline to risk, therapeutic response, and outcome. To maximize the value of these data, an interdisciplinary approach is paramount. The National Cancer Institute (NCI) has initiated multiple projects to characterize tumor samples using multi-omic approaches. These projects harness the expertise of clinicians, biologists, computer scientists, and software engineers to investigate cancer biology and therapeutic response in multidisciplinary teams. Petabytes of cancer genomic, transcriptomic, epigenomic, proteomic, and imaging data have been generated by these projects. To address the data analysis challenges associated with these large datasets, the NCI has sponsored the development of the Genomic Data Commons (GDC) and three Cloud Resources. The GDC ensures data and metadata quality, ingests and harmonizes genomic data, and securely redistributes the data. During its pilot phase, the Cloud Resources tested multiple cloud-based approaches for enhancing data access, collaboration, computational scalability, resource democratization, and reproducibility. These NCI-led efforts are continuously being refined to better support open data practices and precision oncology, and to serve as building blocks of the NCI Cancer Research Data Commons.
http://journal.frontiersin.org/article/10.3389/fcell.2017.00083/full
genomics
proteomics
imaging
big data
cancer
precision medicine
collection DOAJ
language English
format Article
sources DOAJ
author Izumi V. Hinkson
Tanja M. Davidsen
Juli D. Klemm
Ishwar Chandramouliswaran
Anthony R. Kerlavage
Warren A. Kibbe
spellingShingle Izumi V. Hinkson
Tanja M. Davidsen
Juli D. Klemm
Ishwar Chandramouliswaran
Anthony R. Kerlavage
Warren A. Kibbe
A Comprehensive Infrastructure for Big Data in Cancer Research: Accelerating Cancer Research and Precision Medicine
Frontiers in Cell and Developmental Biology
genomics
proteomics
imaging
big data
cancer
precision medicine
author_facet Izumi V. Hinkson
Tanja M. Davidsen
Juli D. Klemm
Ishwar Chandramouliswaran
Anthony R. Kerlavage
Warren A. Kibbe
author_sort Izumi V. Hinkson
title A Comprehensive Infrastructure for Big Data in Cancer Research: Accelerating Cancer Research and Precision Medicine
title_short A Comprehensive Infrastructure for Big Data in Cancer Research: Accelerating Cancer Research and Precision Medicine
title_full A Comprehensive Infrastructure for Big Data in Cancer Research: Accelerating Cancer Research and Precision Medicine
title_fullStr A Comprehensive Infrastructure for Big Data in Cancer Research: Accelerating Cancer Research and Precision Medicine
title_full_unstemmed A Comprehensive Infrastructure for Big Data in Cancer Research: Accelerating Cancer Research and Precision Medicine
title_sort comprehensive infrastructure for big data in cancer research: accelerating cancer research and precision medicine
publisher Frontiers Media S.A.
series Frontiers in Cell and Developmental Biology
issn 2296-634X
publishDate 2017-09-01
description Advancements in next-generation sequencing and other -omics technologies are accelerating the detailed molecular characterization of individual patient tumors, and driving the evolution of precision medicine. Cancer is no longer considered a single disease, but rather, a diverse array of diseases wherein each patient has a unique collection of germline variants and somatic mutations. Molecular profiling of patient-derived samples has led to a data explosion that could help us understand the contributions of environment and germline to risk, therapeutic response, and outcome. To maximize the value of these data, an interdisciplinary approach is paramount. The National Cancer Institute (NCI) has initiated multiple projects to characterize tumor samples using multi-omic approaches. These projects harness the expertise of clinicians, biologists, computer scientists, and software engineers to investigate cancer biology and therapeutic response in multidisciplinary teams. Petabytes of cancer genomic, transcriptomic, epigenomic, proteomic, and imaging data have been generated by these projects. To address the data analysis challenges associated with these large datasets, the NCI has sponsored the development of the Genomic Data Commons (GDC) and three Cloud Resources. The GDC ensures data and metadata quality, ingests and harmonizes genomic data, and securely redistributes the data. During its pilot phase, the Cloud Resources tested multiple cloud-based approaches for enhancing data access, collaboration, computational scalability, resource democratization, and reproducibility. These NCI-led efforts are continuously being refined to better support open data practices and precision oncology, and to serve as building blocks of the NCI Cancer Research Data Commons.
topic genomics
proteomics
imaging
big data
cancer
precision medicine
url http://journal.frontiersin.org/article/10.3389/fcell.2017.00083/full