Efficient and accurate identification of protein complexes from protein-protein interaction networks based on the clustering coefficient

Identification of protein complexes from protein-protein interaction (PPI) networks is a key problem in PPI mining, solved by parameter-dependent approaches that suffer from small recall rates. Here we introduce GCC-v, a family of efficient, parameter-free algorithms to accurately predict protein co...

Bibliographic Details
Main Authors:	Sara Omranian, Angela Angeleska, Zoran Nikoloski
Format:	Article
Language:	English
Published:	Elsevier 2021-01-01
Series:	Computational and Structural Biotechnology Journal
Subjects:	Protein complexes; Protein-protein interaction; Network clustering; Species comparison
Online Access:	http://www.sciencedirect.com/science/article/pii/S2001037021003998

id	doaj-f5797053a4854c6599d2f5073df12f9a
record_format	Article
spelling	doaj-f5797053a4854c6599d2f5073df12f9a | 2021-09-27T04:24:50Z | eng | Elsevier | Computational and Structural Biotechnology Journal | 2001-0370 | 2021-01-01 | 19 | 5255 | 5263 | Efficient and accurate identification of protein complexes from protein-protein interaction networks based on the clustering coefficient | Sara Omranian (0) | Angela Angeleska (1) | Zoran Nikoloski (2) | (0) Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, 14476 Potsdam, Germany; Systems Biology and Mathematical Modeling, Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam, Germany | (1) Mathematics Department, University of Tampa, Tampa, FL, USA | (2) Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, 14476 Potsdam, Germany; Systems Biology and Mathematical Modeling, Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam, Germany; Corresponding author at: Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, 14476 Potsdam, Germany; Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam, Germany. | Identification of protein complexes from protein-protein interaction (PPI) networks is a key problem in PPI mining, solved by parameter-dependent approaches that suffer from small recall rates. Here we introduce GCC-v, a family of efficient, parameter-free algorithms to accurately predict protein complexes using the (weighted) clustering coefficient of proteins in PPI networks. Through comparative analyses with gold standards and PPI networks from Escherichia coli, Saccharomyces cerevisiae, and Homo sapiens, we demonstrate that GCC-v outperforms twelve state-of-the-art approaches for identification of protein complexes with respect to twelve performance measures in at least 85.71% of scenarios. We also show that GCC-v results in the exact recovery of ∼35% of protein complexes in a pan-plant PPI network and discover 144 new protein complexes in Arabidopsis thaliana, with high support from GO semantic similarity. Our results indicate that findings from GCC-v are robust to network perturbations, which has direct implications to assess the impact of the PPI network quality on the predicted protein complexes. | http://www.sciencedirect.com/science/article/pii/S2001037021003998 | Protein complexes | Protein-protein interaction | Network clustering | Species comparison
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Sara Omranian; Angela Angeleska; Zoran Nikoloski
spellingShingle	Sara Omranian; Angela Angeleska; Zoran Nikoloski | Efficient and accurate identification of protein complexes from protein-protein interaction networks based on the clustering coefficient | Computational and Structural Biotechnology Journal | Protein complexes; Protein-protein interaction; Network clustering; Species comparison
author_facet	Sara Omranian; Angela Angeleska; Zoran Nikoloski
author_sort	Sara Omranian
title	Efficient and accurate identification of protein complexes from protein-protein interaction networks based on the clustering coefficient
title_sort	efficient and accurate identification of protein complexes from protein-protein interaction networks based on the clustering coefficient
publisher	Elsevier
series	Computational and Structural Biotechnology Journal
issn	2001-0370
publishDate	2021-01-01
description	Identification of protein complexes from protein-protein interaction (PPI) networks is a key problem in PPI mining, solved by parameter-dependent approaches that suffer from small recall rates. Here we introduce GCC-v, a family of efficient, parameter-free algorithms to accurately predict protein complexes using the (weighted) clustering coefficient of proteins in PPI networks. Through comparative analyses with gold standards and PPI networks from Escherichia coli, Saccharomyces cerevisiae, and Homo sapiens, we demonstrate that GCC-v outperforms twelve state-of-the-art approaches for identification of protein complexes with respect to twelve performance measures in at least 85.71% of scenarios. We also show that GCC-v results in the exact recovery of ∼35% of protein complexes in a pan-plant PPI network and discover 144 new protein complexes in Arabidopsis thaliana, with high support from GO semantic similarity. Our results indicate that findings from GCC-v are robust to network perturbations, which has direct implications to assess the impact of the PPI network quality on the predicted protein complexes.
topic	Protein complexes; Protein-protein interaction; Network clustering; Species comparison
url	http://www.sciencedirect.com/science/article/pii/S2001037021003998
