A comparative study of topology-based pathway enrichment analysis methods

Abstract Background: Pathway enrichment is extensively used in the analysis of omics data to gain biological insight into the functional roles of pre-defined subsets of genes, proteins and metabolites. A large number of methods have been proposed in the literature for this task. The vast majority o...

Bibliographic Details
Main Authors: Jing Ma, Ali Shojaie, George Michailidis
Format: Article
Language: English
Published: BMC 2019-11-01
Series: BMC Bioinformatics
Subjects: Pathway enrichment analysis; Pathway topology; Type I error; Power; Differential network biology
Online Access: http://link.springer.com/article/10.1186/s12859-019-3146-1
id doaj-b45a3262994f415aac024d4b021849c4
record_format Article
spelling doaj-b45a3262994f415aac024d4b021849c4; 2020-11-25T04:08:53Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2019-11-01; 20; 1; 1; 14; 10.1186/s12859-019-3146-1; A comparative study of topology-based pathway enrichment analysis methods; Jing Ma [0]; Ali Shojaie [1]; George Michailidis [2]; Texas A&M University; University of Washington; University of Florida
collection DOAJ
language English
format Article
sources DOAJ
author Jing Ma
Ali Shojaie
George Michailidis
title A comparative study of topology-based pathway enrichment analysis methods
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2019-11-01
description Abstract
Background: Pathway enrichment is extensively used in the analysis of omics data to gain biological insight into the functional roles of pre-defined subsets of genes, proteins and metabolites. A large number of methods have been proposed in the literature for this task. The vast majority of these methods take as input the expression levels of the biomolecules under study together with their membership in pathways of interest. The latest generation of pathway enrichment methods also leverages information on the topology of the underlying pathways, which, as evidence from their evaluation reveals, leads to improved sensitivity and specificity. Nevertheless, a systematic empirical comparison of such methods is still lacking, which makes it challenging to select the most suitable method for a specific experimental setting. This comparative study of nine network-based methods for pathway enrichment analysis aims to provide a systematic evaluation of their performance based on three real data sets with different numbers of features (genes/metabolites) and samples.
Results: The findings highlight both methodological and empirical differences across the nine methods. In particular, certain methods assess pathway enrichment due to differences both in expression levels and in the strength of the interconnectedness of the members of the pathway, while others leverage only differential expression levels. In the more challenging setting involving a metabolomics data set, the results show that methods that utilize both pieces of information (with NetGSA being a prototypical one) exhibit superior statistical power in detecting pathway enrichment.
Conclusion: The analysis reveals that a number of methods perform equally well when testing large pathways, which is the case with genomic data. On the other hand, NetGSA, which takes into consideration both differential expression of the biomolecules in the pathway and changes in the topology, exhibits superior performance when testing small pathways, which is usually the case for metabolomics data.
topic Pathway enrichment analysis
Pathway topology
Type I error
Power
Differential network biology
url http://link.springer.com/article/10.1186/s12859-019-3146-1