CLIC, a tool for expanding biological pathways based on co-expression across thousands of datasets.

In recent years, there has been a huge rise in the number of publicly available transcriptional profiling datasets. These massive compendia comprise billions of measurements and provide a special opportunity to predict the function of unstudied genes based on co-expression with well-studied pathways....

Bibliographic Details
Main Authors: Yang Li, Alexis A Jourdain, Sarah E Calvo, Jun S Liu, Vamsi K Mootha
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2017-07-01
Series: PLoS Computational Biology
Online Access: http://europepmc.org/articles/PMC5546725?pdf=render
id doaj-71bcd98defde41989dde88c9fe19246f
record_format Article
spelling doaj-71bcd98defde41989dde88c9fe19246f; 2020-11-25T01:44:26Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; ISSN 1553-734X, 1553-7358; 2017-07-01; volume 13, issue 7, e1005653; DOI 10.1371/journal.pcbi.1005653; CLIC, a tool for expanding biological pathways based on co-expression across thousands of datasets; Yang Li, Alexis A Jourdain, Sarah E Calvo, Jun S Liu, Vamsi K Mootha; http://europepmc.org/articles/PMC5546725?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Yang Li
Alexis A Jourdain
Sarah E Calvo
Jun S Liu
Vamsi K Mootha
publisher Public Library of Science (PLoS)
series PLoS Computational Biology
issn 1553-734X
1553-7358
publishDate 2017-07-01
description In recent years, there has been a huge rise in the number of publicly available transcriptional profiling datasets. These massive compendia comprise billions of measurements and provide a special opportunity to predict the function of unstudied genes based on co-expression with well-studied pathways. Such analyses can be very challenging, however, since biological pathways are modular and may exhibit co-expression only in specific contexts. To overcome these challenges we introduce CLIC, CLustering by Inferred Co-expression. CLIC accepts as input a pathway consisting of two or more genes. It then uses a Bayesian partition model to simultaneously partition the input gene set into coherent co-expressed modules (CEMs), while assigning the posterior probability for each dataset in support of each CEM. CLIC then expands each CEM by scanning the transcriptome for additional co-expressed genes, quantified by an integrated log-likelihood ratio (LLR) score weighted for each dataset. As a byproduct, CLIC automatically learns the conditions (datasets) within which a CEM is operative. We implemented CLIC using a compendium of 1774 mouse microarray datasets (28628 microarrays) or 1887 human microarray datasets (45158 microarrays). CLIC analysis reveals that of 910 canonical biological pathways, 30% consist of strongly co-expressed gene modules for which new members are predicted. For example, CLIC predicts a functional connection between protein C7orf55 (FMC1) and the mitochondrial ATP synthase complex that we have experimentally validated. CLIC is freely available at www.gene-clic.org. We anticipate that CLIC will be valuable both for revealing new components of biological pathways and for identifying the conditions in which they are active.
url http://europepmc.org/articles/PMC5546725?pdf=render
_version_ 1725028697584435200
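
Illustrative note: the description above says that candidate genes are scored by co-expression with a module, with each dataset weighted by how strongly it supports that module. The sketch below shows only that general idea. It is not CLIC's Bayesian partition model or its LLR score: it substitutes a plain weighted mean of Pearson correlations, and every name in it (pearson, weighted_coexpression_score, the gene labels, the toy datasets, and the weights) is a hypothetical choice made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def weighted_coexpression_score(candidate, module, datasets, weights):
    """Weighted mean, over datasets, of the candidate's average correlation
    with the module genes.

    datasets: list of dicts mapping gene label -> expression profile.
    weights:  one non-negative number per dataset; a stand-in (assumption)
              for how strongly that dataset supports the module.
    """
    total, weight_sum = 0.0, 0.0
    for data, w in zip(datasets, weights):
        if candidate not in data:
            continue  # candidate not measured in this dataset
        members = [g for g in module if g in data and g != candidate]
        if not members:
            continue  # no module genes measured in this dataset
        mean_r = sum(pearson(data[candidate], data[g]) for g in members) / len(members)
        total += w * mean_r
        weight_sum += w
    return total / weight_sum if weight_sum > 0 else 0.0


# Toy data (hypothetical): the module geneA/geneB is coherent only in the first dataset.
toy_datasets = [
    {"geneA": [1, 2, 3, 4], "geneB": [2, 4, 6, 8],
     "geneX": [1.5, 2.5, 3.5, 4.5], "geneY": [4, 3, 2, 1]},
    {"geneA": [1, 2, 3, 4], "geneB": [4, 1, 3, 2],
     "geneX": [4, 3, 2, 1], "geneY": [1, 2, 3, 4]},
]
toy_weights = [1.0, 0.0]  # the second dataset is given no say
toy_module = ["geneA", "geneB"]

score_x = weighted_coexpression_score("geneX", toy_module, toy_datasets, toy_weights)
score_y = weighted_coexpression_score("geneY", toy_module, toy_datasets, toy_weights)
print(score_x, score_y)
```

Because the second toy dataset has weight zero, geneX ranks above geneY even though their profiles are reversed in that dataset. This mirrors, in a very simplified way, the point that co-expression may be informative only in specific contexts.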