Personalized pathway enrichment map of putative cancer genes from next generation sequencing data.

BACKGROUND: Pathway analysis of a set of genes represents an important area in large-scale omic data analysis. However, the application of traditional pathway enrichment methods to next-generation sequencing (NGS) data is prone to several potential biases, including genomic/genetic factors (e.g., th...


Bibliographic Details
Main Authors: Peilin Jia, Zhongming Zhao
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2012-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3356304?pdf=render
id doaj-e5dc1dfe4f814e69a396065307883ce4
record_format Article
spelling doaj-e5dc1dfe4f814e69a396065307883ce4 2020-11-24T21:17:53Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2012-01-01 7(5): e37595 doi:10.1371/journal.pone.0037595 Personalized pathway enrichment map of putative cancer genes from next generation sequencing data. Peilin Jia, Zhongming Zhao http://europepmc.org/articles/PMC3356304?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Peilin Jia
Zhongming Zhao
title Personalized pathway enrichment map of putative cancer genes from next generation sequencing data.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2012-01-01
description BACKGROUND: Pathway analysis of a set of genes represents an important area in large-scale omic data analysis. However, the application of traditional pathway enrichment methods to next-generation sequencing (NGS) data is prone to several potential biases, including genomic/genetic factors (e.g., the particular disease and gene length) and environmental factors (e.g., personal lifestyle and the frequency and dosage of exposure to mutagens). Therefore, novel methods are urgently needed for these new data types, especially for individual-specific genome data.
METHODOLOGY: In this study, we proposed a novel method for the pathway analysis of NGS mutation data that explicitly takes into account the gene-wise mutation rate. We estimated the gene-wise mutation rate from the individual-specific background mutation rate together with the gene length. Taking the mutation rate as a weight for each gene, our weighted resampling strategy builds the null distribution for each pathway while matching the gene length patterns. The resulting empirical P value then provides an adjusted statistical evaluation.
PRINCIPAL FINDINGS/CONCLUSIONS: We applied our weighted resampling method to a lung adenocarcinoma dataset and a glioblastoma dataset, and compared it with other widely applied methods. By explicitly adjusting for gene length, the weighted resampling method performs as well as the standard methods for significant pathways with strong evidence. Importantly, our method could effectively reject many marginally significant pathways detected by standard methods, including several long-gene-based, cancer-unrelated pathways. We further demonstrated that, by reducing such biases, pathway crosstalk for each individual and pathway co-mutation maps across multiple individuals can be objectively explored and evaluated. This method performs pathway analysis in a sample-centered fashion and provides an alternative way to accurately analyze personalized cancer genomes. It can be extended to other types of genomic data (genotyping and methylation) that have similar bias problems.
url http://europepmc.org/articles/PMC3356304?pdf=render
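The weighted resampling idea summarized in the description can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact weight formula, so the sketch assumes a per-gene weight of 1 - exp(-rate × length), draws null gene sets without replacement using exponential keys, and applies an add-one correction to the empirical P value. All function names, parameters, and the toy gene lengths are hypothetical.

```python
import math
import random


def gene_weights(gene_lengths, background_rate):
    """Assumed weight: chance that a gene carries at least one mutation
    when each base mutates independently at `background_rate`."""
    return {
        gene: 1.0 - math.exp(-background_rate * length)
        for gene, length in gene_lengths.items()
    }


def weighted_sample(weights, k, rng):
    """Draw k distinct genes, favouring genes with larger weights.

    Each gene gets an exponential waiting time with rate equal to its
    weight; the k shortest waiting times form the sample.
    """
    keyed = [
        (-math.log(1.0 - rng.random()) / w, gene)
        for gene, w in weights.items()
        if w > 0
    ]
    keyed.sort()
    return {gene for _, gene in keyed[:k]}


def empirical_p(mutated, pathway, gene_lengths, background_rate,
                n_resamples=1000, seed=0):
    """Empirical P value for the overlap between one sample's mutated
    genes and a pathway, against a length-aware null distribution."""
    rng = random.Random(seed)
    weights = gene_weights(gene_lengths, background_rate)
    observed = len(mutated & pathway)
    hits = 0
    for _ in range(n_resamples):
        null_set = weighted_sample(weights, len(mutated), rng)
        if len(null_set & pathway) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Toy data (hypothetical): 100 short genes and 5 very long genes.
toy_lengths = {f"g{i}": 1_000 for i in range(100)}
toy_lengths.update({f"long{i}": 100_000 for i in range(5)})
toy_mutated = {"g0", "g1", "g2", "long0", "long1"}

short_pathway = {"g0", "g1", "g2", "g3"}
long_pathway = {"long0", "long1", "long2"}

print("short-gene pathway:", empirical_p(toy_mutated, short_pathway, toy_lengths, 1e-6))
print("long-gene pathway: ", empirical_p(toy_mutated, long_pathway, toy_lengths, 1e-6))
```

In this toy setting the long-gene pathway should receive a much larger P value than the short-gene pathway, because long genes are frequently drawn into the null sets. That mirrors the abstract's point that length-aware resampling can reject pathways whose apparent enrichment is driven by long genes.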