Interactogeneous: disease gene prioritization using heterogeneous networks and full topology scores.

Disease gene prioritization aims to suggest potential implications of genes in disease susceptibility. Often accomplished in a guilt-by-association scheme, promising candidates are sorted according to their relatedness to known disease genes. Network-based methods have been successfully exploiting t...

Bibliographic Details
Main Authors: Joana P Gonçalves, Alexandre P Francisco, Yves Moreau, Sara C Madeira
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2012-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3501465?pdf=render
id doaj-76b7b7604bce4ddb9719c3ac0715b140
record_format Article
doi 10.1371/journal.pone.0049634
citation PLoS ONE 7(11): e49634
collection DOAJ
sources DOAJ
author Joana P Gonçalves
Alexandre P Francisco
Yves Moreau
Sara C Madeira
title Interactogeneous: disease gene prioritization using heterogeneous networks and full topology scores.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2012-01-01
description Disease gene prioritization aims to suggest potential implications of genes in disease susceptibility. Often accomplished in a guilt-by-association scheme, promising candidates are sorted according to their relatedness to known disease genes. Network-based methods have been successfully exploiting this concept by capturing the interaction of genes or proteins into a score. Nonetheless, most current approaches yield at least some of the following limitations: (1) networks comprise only curated physical interactions leading to poor genome coverage and density, and bias toward a particular source; (2) scores focus on adjacencies (direct links) or the most direct paths (shortest paths) within a constrained neighborhood around the disease genes, ignoring potentially informative indirect paths; (3) global clustering is widely applied to partition the network in an unsupervised manner, attributing little importance to prior knowledge; (4) confidence weights and their contribution to edge differentiation and ranking reliability are often disregarded. We hypothesize that network-based prioritization related to local clustering on graphs and considering full topology of weighted gene association networks integrating heterogeneous sources should overcome the above challenges. We term such a strategy Interactogeneous. We conducted cross-validation tests to assess the impact of network sources, alternative path inclusion and confidence weights on the prioritization of putative genes for 29 diseases. Heat diffusion ranking proved the best prioritization method overall, increasing the gap to neighborhood and shortest paths scores mostly on single source networks. Heterogeneous associations consistently delivered superior performance over single source data across the majority of methods. Results on the contribution of confidence weights were inconclusive. Finally, the best Interactogeneous strategy, heat diffusion ranking and associations from the STRING database, was used to prioritize genes for Parkinson's disease. This method effectively recovered known genes and uncovered interesting candidates which could be linked to pathogenic mechanisms of the disease.
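
The abstract names heat diffusion ranking over a weighted association network but does not spell out a formulation. As an illustration only, the sketch below assumes one common reading: known disease genes (seeds) start with one unit of heat each, heat flows along edges in proportion to their confidence weights, and candidates are ranked by the heat they hold after diffusion time `t`. The function name `heat_diffusion_rank`, the parameters `t` and `steps`, the explicit Euler approximation of exp(-tL), and the toy gene labels and weights are all assumptions made for this example, not the authors' implementation or data.

```python
from collections import defaultdict


def heat_diffusion_rank(edges, seeds, t=0.5, steps=200):
    """Rank non-seed genes by heat received from seed genes.

    edges: iterable of (gene_a, gene_b, confidence_weight), undirected.
    seeds: set of known disease genes, each starting with 1.0 unit of heat.
    Approximates exp(-t * L) applied to the initial heat vector, where
    L = D - W is the weighted graph Laplacian, using `steps` Euler steps.
    The step size t / steps must be small relative to the largest
    weighted degree for the iteration to stay stable.
    """
    adj = defaultdict(dict)
    for a, b, w in edges:
        adj[a][b] = w
        adj[b][a] = w

    heat = {g: (1.0 if g in seeds else 0.0) for g in adj}
    dt = t / steps
    for _ in range(steps):
        new_heat = {}
        for g, nbrs in adj.items():
            # Net inflow: heat moves from hotter to cooler neighbours,
            # scaled by the edge confidence weight.
            flow = sum(w * (heat[n] - heat[g]) for n, w in nbrs.items())
            new_heat[g] = heat[g] + dt * flow
        heat = new_heat

    candidates = [g for g in adj if g not in seeds]
    ranking = sorted(candidates, key=lambda g: heat[g], reverse=True)
    return ranking, heat


# Hypothetical toy network: A is the only known disease gene.
edges = [
    ("A", "B", 0.9),  # strong direct association
    ("B", "C", 0.8),  # C is reachable only through B
    ("A", "D", 0.2),  # weak direct association
    ("C", "E", 0.7),  # E is three hops from the seed
]
ranking, heat = heat_diffusion_rank(edges, seeds={"A"})
for gene in ranking:
    print(gene, round(heat[gene], 4))
```

Unlike a neighborhood or shortest-path score, every path between a seed and a candidate contributes here, so a gene such as C, which has no direct link to the seed, still receives a score through its strongly weighted indirect path.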