CLIC, a tool for expanding biological pathways based on co-expression across thousands of datasets.
In recent years, there has been a huge rise in the number of publicly available transcriptional profiling datasets. These massive compendia comprise billions of measurements and provide a special opportunity to predict the function of unstudied genes based on co-expression to well-studied pathways....
Main Authors: | Yang Li, Alexis A Jourdain, Sarah E Calvo, Jun S Liu, Vamsi K Mootha |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2017-07-01 |
Series: | PLoS Computational Biology |
Online Access: | http://europepmc.org/articles/PMC5546725?pdf=render |
id |
doaj-71bcd98defde41989dde88c9fe19246f |
---|---|
record_format |
Article |
spelling |
doaj-71bcd98defde41989dde88c9fe19246f; 2020-11-25T01:44:26Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; ISSN 1553-734X, 1553-7358; 2017-07-01; vol. 13, no. 7, e1005653; doi:10.1371/journal.pcbi.1005653; CLIC, a tool for expanding biological pathways based on co-expression across thousands of datasets.; Yang Li, Alexis A Jourdain, Sarah E Calvo, Jun S Liu, Vamsi K Mootha; http://europepmc.org/articles/PMC5546725?pdf=render |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Yang Li, Alexis A Jourdain, Sarah E Calvo, Jun S Liu, Vamsi K Mootha |
title |
CLIC, a tool for expanding biological pathways based on co-expression across thousands of datasets. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS Computational Biology |
issn |
1553-734X 1553-7358 |
publishDate |
2017-07-01 |
description |
In recent years, there has been a huge rise in the number of publicly available transcriptional profiling datasets. These massive compendia comprise billions of measurements and provide a special opportunity to predict the function of unstudied genes based on co-expression to well-studied pathways. Such analyses can be very challenging, however, since biological pathways are modular and may exhibit co-expression only in specific contexts. To overcome these challenges we introduce CLIC, CLustering by Inferred Co-expression. CLIC accepts as input a pathway consisting of two or more genes. It then uses a Bayesian partition model to simultaneously partition the input gene set into coherent co-expressed modules (CEMs), while assigning the posterior probability for each dataset in support of each CEM. CLIC then expands each CEM by scanning the transcriptome for additional co-expressed genes, quantified by an integrated log-likelihood ratio (LLR) score weighted for each dataset. As a byproduct, CLIC automatically learns the conditions (datasets) within which a CEM is operative. We implemented CLIC using a compendium of 1774 mouse microarray datasets (28628 microarrays) or 1887 human microarray datasets (45158 microarrays). CLIC analysis reveals that of 910 canonical biological pathways, 30% consist of strongly co-expressed gene modules for which new members are predicted. For example, CLIC predicts a functional connection between protein C7orf55 (FMC1) and the mitochondrial ATP synthase complex that we have experimentally validated. CLIC is freely available at www.gene-clic.org. We anticipate that CLIC will be valuable both for revealing new components of biological pathways as well as the conditions in which they are active. |
url |
http://europepmc.org/articles/PMC5546725?pdf=render |
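
The description in this record says that CLIC expands a co-expressed module by scoring candidate genes with an integrated log-likelihood ratio (LLR) that is weighted for each dataset. The sketch below illustrates only that general idea and is not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names, the use of Fisher-transformed Pearson correlations, the two equal-variance Gaussian models, the parameters `mu_coexp` and `sd`, and the use of a per-dataset weight as a stand-in for the posterior support the record mentions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return sxy / math.sqrt(sxx * syy)


def gaussian_logpdf(z, mean, sd):
    """Log density of a normal distribution at z."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (z - mean) ** 2 / (2 * sd * sd)


def dataset_llr(candidate, module_profiles, mu_coexp=1.0, sd=0.5):
    """LLR for one dataset: 'co-expressed with the module' vs. 'background'.

    Hypothetical model: the mean Fisher-transformed correlation between the
    candidate and the module genes is N(mu_coexp, sd) if co-expressed and
    N(0, sd) otherwise.
    """
    zs = []
    for profile in module_profiles:
        r = max(-0.999, min(0.999, pearson(candidate, profile)))  # keep atanh finite
        zs.append(math.atanh(r))
    z = sum(zs) / len(zs)
    return gaussian_logpdf(z, mu_coexp, sd) - gaussian_logpdf(z, 0.0, sd)


def integrated_llr(datasets, weights):
    """Sum per-dataset LLRs, each scaled by that dataset's weight in [0, 1]."""
    return sum(
        w * dataset_llr(d["candidate"], d["module"])
        for d, w in zip(datasets, weights)
    )


if __name__ == "__main__":
    # Toy data: the candidate tracks the module in the first dataset only.
    coexpressed = {"candidate": [1.0, 2.0, 3.0, 4.0], "module": [[2.0, 4.0, 6.0, 8.0]]}
    unrelated = {"candidate": [1.0, -1.0, -1.0, 1.0], "module": [[1.0, 2.0, 3.0, 4.0]]}
    print(integrated_llr([coexpressed, unrelated], [0.9, 0.1]))
```

The weighting is the point of the sketch: a dataset in which the module is not active gets a weight near zero and so contributes little to the candidate's score, which matches the record's remark that co-expression may appear only in specific contexts.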