Multiple-input multiple-output causal strategies for gene selection
<p>Abstract</p> <p>Background</p> <p>Traditional strategies for selecting variables in high-dimensional classification problems aim to find sets of maximally relevant variables able to explain the target variations. While these techniques may be effective in generalization...
Main Authors: | Bontempi Gianluca, Haibe-Kains Benjamin, Desmedt Christine, Sotiriou Christos, Quackenbush John |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2011-11-01 |
Series: | BMC Bioinformatics |
Online Access: | http://www.biomedcentral.com/1471-2105/12/458 |
id |
doaj-a285955455a9486090b8ed8bcc33b529 |
---|---|
record_format |
Article |
spelling |
doaj-a285955455a9486090b8ed8bcc33b529 2020-11-25T02:28:17Z eng BMC BMC Bioinformatics 1471-2105 2011-11-01 12 1 458 10.1186/1471-2105-12-458 Multiple-input multiple-output causal strategies for gene selection Bontempi Gianluca Haibe-Kains Benjamin Desmedt Christine Sotiriou Christos Quackenbush John <p>Abstract</p> <p>Background</p> <p>Traditional strategies for selecting variables in high-dimensional classification problems aim to find sets of maximally relevant variables able to explain the target variations. While these techniques may be effective in terms of generalization accuracy, they often do not reveal direct causes. This is essentially because high correlation (or relevance) does not imply causation. In this study, we show how to efficiently incorporate causal information into gene selection by moving from a single-input single-output to a multiple-input multiple-output setting.</p> <p>Results</p> <p>We show in a synthetic case study that a better prioritization of causal variables can be obtained by considering a relevance score which incorporates a causal term. In addition, we show, in a meta-analysis study of six publicly available breast cancer microarray datasets, that the improvement also holds in terms of accuracy. The biological interpretation of the results confirms the potential of a causal approach to gene selection.</p> <p>Conclusions</p> <p>Integrating causal information into gene selection algorithms is effective both in terms of prediction accuracy and biological interpretation.</p> http://www.biomedcentral.com/1471-2105/12/458 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Bontempi Gianluca Haibe-Kains Benjamin Desmedt Christine Sotiriou Christos Quackenbush John |
author_sort |
Bontempi Gianluca |
title |
Multiple-input multiple-output causal strategies for gene selection |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2011-11-01 |
description |
<p>Abstract</p> <p>Background</p> <p>Traditional strategies for selecting variables in high-dimensional classification problems aim to find sets of maximally relevant variables able to explain the target variations. While these techniques may be effective in terms of generalization accuracy, they often do not reveal direct causes. This is essentially because high correlation (or relevance) does not imply causation. In this study, we show how to efficiently incorporate causal information into gene selection by moving from a single-input single-output to a multiple-input multiple-output setting.</p> <p>Results</p> <p>We show in a synthetic case study that a better prioritization of causal variables can be obtained by considering a relevance score which incorporates a causal term. In addition, we show, in a meta-analysis study of six publicly available breast cancer microarray datasets, that the improvement also holds in terms of accuracy. The biological interpretation of the results confirms the potential of a causal approach to gene selection.</p> <p>Conclusions</p> <p>Integrating causal information into gene selection algorithms is effective both in terms of prediction accuracy and biological interpretation.</p> |
url |
http://www.biomedcentral.com/1471-2105/12/458 |
_version_ |
1724839203426009088 |