Statistical method on nonrandom clustering with application to somatic mutations in cancer

Abstract. Background: Human cancer is caused by the accumulation of tumor-specific mutations in oncogenes and tumor suppressors that confer a selective growth advantage to cells. As a consequence of genomic instability and high levels of proliferation, ma...


Bibliographic Details
Main Authors: Rejto Paul A, Lunney Elizabeth A, Pavlicek Adam, Ye Jingjing, Teng Chi-Hse
Format: Article
Language: English
Published: BMC 2010-01-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/11/11
id doaj-ac19e0b801794b1a9b0fae94d0ecb484
record_format Article
spelling doaj-ac19e0b801794b1a9b0fae94d0ecb484 | 2020-11-24T21:10:27Z | eng | BMC | BMC Bioinformatics | 1471-2105 | 2010-01-01 | 11 | 1 | 11 | 10.1186/1471-2105-11-11 | Statistical method on nonrandom clustering with application to somatic mutations in cancer | Rejto Paul A; Lunney Elizabeth A; Pavlicek Adam; Ye Jingjing; Teng Chi-Hse | http://www.biomedcentral.com/1471-2105/11/11
collection DOAJ
language English
format Article
sources DOAJ
author Rejto Paul A
Lunney Elizabeth A
Pavlicek Adam
Ye Jingjing
Teng Chi-Hse
title Statistical method on nonrandom clustering with application to somatic mutations in cancer
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2010-01-01
description Abstract. Background: Human cancer is caused by the accumulation of tumor-specific mutations in oncogenes and tumor suppressors that confer a selective growth advantage to cells. As a consequence of genomic instability and high levels of proliferation, many passenger mutations that do not contribute to the cancer phenotype arise alongside mutations that drive oncogenesis. While several approaches have been developed to separate driver mutations from passengers, few can specifically identify activating driver mutations in oncogenes, which are more amenable to pharmacological intervention. Results: We propose a new statistical method for detecting activating mutations in cancer by identifying nonrandom clusters of amino acid mutations in protein sequences. A probability model is derived using order statistics, assuming that the locations of amino acid mutations on a protein follow a uniform distribution. Our statistical measure is the difference between pairs of order statistics, which is equivalent to the size of an amino acid mutation cluster, and the probabilities are derived from exact and approximate distributions of this measure. Using data from the Catalog of Somatic Mutations in Cancer (COSMIC) database, we demonstrate that our method detects well-known clusters of activating mutations in KRAS, BRAF, PI3K, and β-catenin. The method can also identify new cancer targets as well as gain-of-function mutations in tumor suppressors. Conclusions: Our proposed method is useful for discovering activating driver mutations in cancer by identifying nonrandom clusters of somatic amino acid mutations in protein sequences.
url http://www.biomedcentral.com/1471-2105/11/11
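
Illustrative note (not part of the original record): the abstract describes scoring a mutation cluster by the difference between two order statistics under a uniform null. The sketch below shows one way such a probability could be computed, assuming a continuous Uniform(0,1) approximation. Under that assumption the difference X_(k) - X_(i) among n positions follows a Beta(k-i, n-k+i+1) distribution, whose CDF for integer parameters reduces to a binomial tail sum. The function name `cluster_p_value`, the protein length, and the mutation positions are hypothetical, and the article's exact (discrete) formulas may differ.

```python
from math import comb


def cluster_p_value(n: int, i: int, k: int, span: float) -> float:
    """P(X_(k) - X_(i) <= span) for n i.i.d. Uniform(0,1) positions.

    Assumption: continuous uniform null. The difference of the k-th and
    i-th order statistics is Beta(k - i, n - k + i + 1), and for integer
    parameters its CDF equals the binomial tail P(Bin(n, span) >= k - i).
    """
    if not 1 <= i < k <= n:
        raise ValueError("require 1 <= i < k <= n")
    if not 0.0 <= span <= 1.0:
        raise ValueError("span must be a fraction of the sequence length")
    r = k - i
    return sum(
        comb(n, j) * span**j * (1.0 - span) ** (n - j)
        for j in range(r, n + 1)
    )


# Hypothetical example: 5 mutations on a 200-residue protein.
protein_length = 200
positions = sorted([10, 50, 52, 53, 120])
i, k = 2, 4  # cluster spanning the 2nd to 4th ordered mutations
span = (positions[k - 1] - positions[i - 1]) / protein_length
p = cluster_p_value(len(positions), i, k, span)
print(f"cluster {positions[i - 1]}-{positions[k - 1]}: p = {p:.4g}")
```

A small p-value suggests the mutations lie closer together than a uniform placement would predict. This sketch applies no multiple-testing correction across candidate (i, k) pairs and ignores that several mutations can hit the same residue.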