Comprehensive assessment of computational algorithms in predicting cancer driver mutations

Abstract Background The initiation and subsequent evolution of cancer are largely driven by a relatively small number of somatic mutations with critical functional impacts, so-called driver mutations. Identifying driver mutations in a patient’s tumor cells is a central task in the era of precision c...

Bibliographic Details
Main Authors:	Hu Chen, Jun Li, Yumeng Wang, Patrick Kwok-Shing Ng, Yiu Huen Tsang, Kenna R. Shaw, Gordon B. Mills, Han Liang
Format:	Article
Language:	English
Published:	BMC 2020-02-01
Series:	Genome Biology
Subjects:	The Cancer Genome Atlas, Driver mutations, Passenger mutations, 3D clustering, TP53 mutations, Tumor transformation
Online Access:	http://link.springer.com/article/10.1186/s13059-020-01954-z

id	doaj-1c753689051a492c944a9e2334f8a8f8
record_format	Article
spelling	doaj-1c753689051a492c944a9e2334f8a8f8 | 2020-11-25T01:41:57Z | eng | BMC | Genome Biology | 1474-760X | 2020-02-01 | 21 | 1 | 1 | 17 | 10.1186/s13059-020-01954-z | Comprehensive assessment of computational algorithms in predicting cancer driver mutations | Hu Chen [0], Jun Li [1], Yumeng Wang [2], Patrick Kwok-Shing Ng [3], Yiu Huen Tsang [4], Kenna R. Shaw [5], Gordon B. Mills [6], Han Liang [7] | [0] Graduate Program in Quantitative and Computational Biosciences, Baylor College of Medicine; [1] Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center; [2] Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center; [3] Institute for Personalized Cancer Therapy, The University of Texas MD Anderson Cancer Center; [4] Department of Cell, Developmental & Cancer Biology, Knight Cancer Institute, Oregon Health Sciences University; [5] Institute for Personalized Cancer Therapy, The University of Texas MD Anderson Cancer Center; [6] Department of Cell, Developmental & Cancer Biology, Knight Cancer Institute, Oregon Health Sciences University; [7] Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center | Abstract. Background: The initiation and subsequent evolution of cancer are largely driven by a relatively small number of somatic mutations with critical functional impacts, so-called driver mutations. Identifying driver mutations in a patient’s tumor cells is a central task in the era of precision cancer medicine. Over the past decade, many computational algorithms have been developed to predict the effects of missense single-nucleotide variants, and they are frequently employed to prioritize mutation candidates. These algorithms employ diverse molecular features to build predictive models, and while some algorithms are cancer-specific, others are not. However, the relative performance of these algorithms has not been rigorously assessed. Results: We construct five complementary benchmark datasets: mutation clustering patterns in protein 3D structures, literature annotation based on OncoKB, TP53 mutations based on their effects on target-gene transactivation, effects of cancer mutations on tumor formation in xenograft experiments, and functional annotation based on in vitro cell viability assays we developed, including a new dataset of ~200 mutations. We evaluate the performance of 33 algorithms and find that CHASM, CTAT-cancer, DEOGEN2, and PrimateAI show consistently better performance than the other algorithms. Moreover, cancer-specific algorithms show much better performance than those designed for a general purpose. Conclusions: Our study is a comprehensive assessment of the performance of different algorithms in predicting cancer driver mutations and provides deep insights into the best practice of computationally prioritizing cancer mutation candidates for end-users and for the future development of new algorithms. | http://link.springer.com/article/10.1186/s13059-020-01954-z | The Cancer Genome Atlas, Driver mutations, Passenger mutations, 3D clustering, TP53 mutations, Tumor transformation
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Hu Chen, Jun Li, Yumeng Wang, Patrick Kwok-Shing Ng, Yiu Huen Tsang, Kenna R. Shaw, Gordon B. Mills, Han Liang
author_sort	Hu Chen
title	Comprehensive assessment of computational algorithms in predicting cancer driver mutations
publisher	BMC
series	Genome Biology
issn	1474-760X
publishDate	2020-02-01
description	Abstract. Background: The initiation and subsequent evolution of cancer are largely driven by a relatively small number of somatic mutations with critical functional impacts, so-called driver mutations. Identifying driver mutations in a patient’s tumor cells is a central task in the era of precision cancer medicine. Over the past decade, many computational algorithms have been developed to predict the effects of missense single-nucleotide variants, and they are frequently employed to prioritize mutation candidates. These algorithms employ diverse molecular features to build predictive models, and while some algorithms are cancer-specific, others are not. However, the relative performance of these algorithms has not been rigorously assessed. Results: We construct five complementary benchmark datasets: mutation clustering patterns in protein 3D structures, literature annotation based on OncoKB, TP53 mutations based on their effects on target-gene transactivation, effects of cancer mutations on tumor formation in xenograft experiments, and functional annotation based on in vitro cell viability assays we developed, including a new dataset of ~200 mutations. We evaluate the performance of 33 algorithms and find that CHASM, CTAT-cancer, DEOGEN2, and PrimateAI show consistently better performance than the other algorithms. Moreover, cancer-specific algorithms show much better performance than those designed for a general purpose. Conclusions: Our study is a comprehensive assessment of the performance of different algorithms in predicting cancer driver mutations and provides deep insights into the best practice of computationally prioritizing cancer mutation candidates for end-users and for the future development of new algorithms.
topic	The Cancer Genome Atlas, Driver mutations, Passenger mutations, 3D clustering, TP53 mutations, Tumor transformation
url	http://link.springer.com/article/10.1186/s13059-020-01954-z
_version_	1725038723483041792

Similar Items