Integrating data and knowledge to identify functional modules of genes: a multilayer approach
Abstract. Background: Characterizing the modular structure of cellular networks is an important way to identify novel genes for targeted therapeutics. This has been made possible by the rise of high-throughput technology. Unfortunately, computational methods to identify functional modules were limited by t...
Main Authors: | Lifan Liang, Vicky Chen, Kunju Zhu, Xiaonan Fan, Xinghua Lu, Songjian Lu |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2019-05-01 |
Series: | BMC Bioinformatics |
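Subjects: | Protein-protein interaction, Graph clustering, Random walk, Multiplex, Topic modeling, Gene expression |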
Online Access: | http://link.springer.com/article/10.1186/s12859-019-2800-y |
id |
doaj-d45cda6f9cf24e3a985e3f3f0819864f |
---|---|
record_format |
Article |
spelling |
doaj-d45cda6f9cf24e3a985e3f3f0819864f; 2020-11-25T02:38:16Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2019-05-01; 201115; 10.1186/s12859-019-2800-y; Integrating data and knowledge to identify functional modules of genes: a multilayer approach; Lifan Liang 0; Vicky Chen 1; Kunju Zhu 2; Xiaonan Fan 3; Xinghua Lu 4; Songjian Lu 5; Department of Biomedical Informatics, University of Pittsburgh; Department of Biomedical Informatics, University of Pittsburgh; Department of Biomedical Informatics, University of Pittsburgh; Department of Biomedical Informatics, University of Pittsburgh; Department of Biomedical Informatics, University of Pittsburgh; Department of Biomedical Informatics, University of Pittsburgh; Abstract. Background: Characterizing the modular structure of cellular networks is an important way to identify novel genes for targeted therapeutics. This has been made possible by the rise of high-throughput technology. Unfortunately, computational methods for identifying functional modules have been limited by the data quality issues of high-throughput techniques. This study aims to integrate knowledge extracted from the literature to further improve the accuracy of functional module identification. Results: Our new model and algorithm were applied to both yeast and human interactomes. The predicted functional modules covered over 90% of the proteins in both organisms while maintaining comparable overall accuracy. We found that combining mRNA expression information with biomedical knowledge greatly improved the performance of functional module identification, outperforming approaches that use a protein interaction network weighted with only transcriptomic data or only literature knowledge, or simply an unweighted protein interaction network. Our new algorithm also performed better than several other well-known methods, especially in terms of positive predictive value (PPV), which indicates the confidence of novel discoveries. Conclusion: The higher PPV of the multiplex approach suggests that information from both sources has been effectively integrated to reduce false positives. With protein coverage higher than 90%, our algorithm is able to generate more novel biological hypotheses with higher confidence.; http://link.springer.com/article/10.1186/s12859-019-2800-y; Protein-protein interaction; Graph clustering; Random walk; Multiplex; Topic modeling; Gene expression |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Lifan Liang Vicky Chen Kunju Zhu Xiaonan Fan Xinghua Lu Songjian Lu |
spellingShingle |
Lifan Liang Vicky Chen Kunju Zhu Xiaonan Fan Xinghua Lu Songjian Lu Integrating data and knowledge to identify functional modules of genes: a multilayer approach BMC Bioinformatics Protein-protein interaction Graph clustering Random walk Multiplex Topic modeling Gene expression |
author_facet |
Lifan Liang Vicky Chen Kunju Zhu Xiaonan Fan Xinghua Lu Songjian Lu |
author_sort |
Lifan Liang |
title |
Integrating data and knowledge to identify functional modules of genes: a multilayer approach |
title_sort |
integrating data and knowledge to identify functional modules of genes: a multilayer approach |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2019-05-01 |
description |
Abstract. Background: Characterizing the modular structure of cellular networks is an important way to identify novel genes for targeted therapeutics. This has been made possible by the rise of high-throughput technology. Unfortunately, computational methods for identifying functional modules have been limited by the data quality issues of high-throughput techniques. This study aims to integrate knowledge extracted from the literature to further improve the accuracy of functional module identification. Results: Our new model and algorithm were applied to both yeast and human interactomes. The predicted functional modules covered over 90% of the proteins in both organisms while maintaining comparable overall accuracy. We found that combining mRNA expression information with biomedical knowledge greatly improved the performance of functional module identification, outperforming approaches that use a protein interaction network weighted with only transcriptomic data or only literature knowledge, or simply an unweighted protein interaction network. Our new algorithm also performed better than several other well-known methods, especially in terms of positive predictive value (PPV), which indicates the confidence of novel discoveries. Conclusion: The higher PPV of the multiplex approach suggests that information from both sources has been effectively integrated to reduce false positives. With protein coverage higher than 90%, our algorithm is able to generate more novel biological hypotheses with higher confidence. |
topic |
Protein-protein interaction Graph clustering Random walk Multiplex Topic modeling Gene expression |
url |
http://link.springer.com/article/10.1186/s12859-019-2800-y |
work_keys_str_mv |
AT lifanliang integratingdataandknowledgetoidentifyfunctionalmodulesofgenesamultilayerapproach AT vickychen integratingdataandknowledgetoidentifyfunctionalmodulesofgenesamultilayerapproach AT kunjuzhu integratingdataandknowledgetoidentifyfunctionalmodulesofgenesamultilayerapproach AT xiaonanfan integratingdataandknowledgetoidentifyfunctionalmodulesofgenesamultilayerapproach AT xinghualu integratingdataandknowledgetoidentifyfunctionalmodulesofgenesamultilayerapproach AT songjianlu integratingdataandknowledgetoidentifyfunctionalmodulesofgenesamultilayerapproach |
_version_ |
1724791702930063360 |