Detecting purely epistatic multi-locus interactions by an omnibus permutation test on ensembles of two-locus analyses

<p>Abstract</p> <p>Background</p> <p>Purely epistatic multi-locus interactions cannot generally be detected via single-locus analysis in case-control studies of complex diseases. Recently, many two-locus and multi-locus analysis techniques have been shown to be promisin...

Bibliographic Details
Main Authors: Limwongse Chanin, Sinsomros Saravudh, Piroonratana Theera, Assawamakin Anunchai, Wongseree Waranyu, Chaiyaratana Nachol
Format: Article
Language: English
Published: BMC 2009-09-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/10/294
id doaj-adb3d9c72af64b64a40771ce159fa815
record_format Article
spelling doaj-adb3d9c72af64b64a40771ce159fa815 2020-11-24T21:12:48Z eng BMC BMC Bioinformatics 1471-2105 2009-09-01 10 1 294 10.1186/1471-2105-10-294 Detecting purely epistatic multi-locus interactions by an omnibus permutation test on ensembles of two-locus analyses Limwongse Chanin, Sinsomros Saravudh, Piroonratana Theera, Assawamakin Anunchai, Wongseree Waranyu, Chaiyaratana Nachol http://www.biomedcentral.com/1471-2105/10/294
collection DOAJ
language English
format Article
sources DOAJ
author Limwongse Chanin
Sinsomros Saravudh
Piroonratana Theera
Assawamakin Anunchai
Wongseree Waranyu
Chaiyaratana Nachol
title Detecting purely epistatic multi-locus interactions by an omnibus permutation test on ensembles of two-locus analyses
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2009-09-01
description <p>Abstract</p>
<p>Background</p> <p>Purely epistatic multi-locus interactions cannot generally be detected via single-locus analysis in case-control studies of complex diseases. Recently, many two-locus and multi-locus analysis techniques have shown promise for epistasis detection. However, exhaustive multi-locus analysis requires prohibitive computational effort when problems involve large-scale or genome-wide data. Furthermore, there is no explicit proof that a combination of multiple two-locus analyses can lead to the correct identification of multi-locus interactions.</p>
<p>Results</p> <p>The proposed 2LOmb algorithm performs an omnibus permutation test on ensembles of two-locus analyses. The algorithm consists of four main steps: two-locus analysis, a permutation test, global <it>p</it>-value determination and a progressive search for the best ensemble. 2LOmb is benchmarked against an exhaustive two-locus analysis technique, a set association approach, a correlation-based feature selection (CFS) technique and a tuned ReliefF (TuRF) technique. The simulation results indicate that 2LOmb produces a low false-positive error. Moreover, 2LOmb performs best both in its ability to identify all causative single nucleotide polymorphisms (SNPs) and in keeping the number of output SNPs low in purely epistatic two-, three- and four-locus interaction problems. The interaction models constructed from the 2LOmb outputs via a multifactor dimensionality reduction (MDR) method are also included to confirm the epistasis detection. 2LOmb is subsequently applied to a type 2 diabetes mellitus (T2D) data set, obtained as part of the UK genome-wide genetic epidemiology study by the Wellcome Trust Case Control Consortium (WTCCC). After a primary screening for SNPs that are located within or near 372 candidate genes and exhibit no marginal single-locus effects, the T2D data set is reduced to 7,065 SNPs from 370 genes.
The 2LOmb search in the reduced T2D data reveals that four intronic SNPs in <it>PGM1</it> (phosphoglucomutase 1), two intronic SNPs in <it>LMX1A</it> (LIM homeobox transcription factor 1, alpha), two intronic SNPs in <it>PARK2</it> (Parkinson disease (autosomal recessive, juvenile) 2, parkin) and three intronic SNPs in <it>GYS2</it> (glycogen synthase 2 (liver)) are associated with the disease. The 2LOmb result suggests that there is no interaction between any pair of the identified genes that can be described by purely epistatic two-locus interaction models. Moreover, there are no interactions between these four genes that can be described by purely epistatic multi-locus interaction models with marginal two-locus effects. The findings provide an alternative explanation for the aetiology of T2D in a UK population.</p>
<p>Conclusion</p> <p>An omnibus permutation test on ensembles of two-locus analyses can detect purely epistatic multi-locus interactions with marginal two-locus effects. The study also reveals that SNPs from large-scale or genome-wide case-control data that are discarded after single-locus analysis detects no association can still be useful for genetic epidemiology studies.</p>
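
The four steps named in the abstract (two-locus analysis, a permutation test, global p-value determination and a progressive search for the best ensemble) can be illustrated with a minimal Python sketch. This is not the authors' 2LOmb implementation. The function names (`pair_chi2`, `ranked_pairs`, `omnibus_test`), the 0/1/2 genotype coding, the use of a plain chi-square statistic on a 9x2 table, and the choice of the summed top-k statistics as the ensemble statistic are all assumptions made for illustration; the record does not specify these details.

```python
import itertools
import numpy as np


def pair_chi2(g1, g2, y):
    """Chi-square statistic of the 9x2 table: two-locus genotype vs. case/control."""
    cell = 3 * g1 + g2                      # genotypes coded 0/1/2 (assumed) -> 9 cells
    table = np.bincount(2 * cell + y, minlength=18).reshape(9, 2).astype(float)
    expected = table.sum(1, keepdims=True) * table.sum(0, keepdims=True) / table.sum()
    mask = expected > 0                     # ignore genotype combinations never observed
    return float(((table - expected)[mask] ** 2 / expected[mask]).sum())


def ranked_pairs(G, y):
    """Step 1 (two-locus analysis): score every SNP pair, strongest first."""
    pairs = list(itertools.combinations(range(G.shape[1]), 2))
    stats = np.array([pair_chi2(G[:, i], G[:, j], y) for i, j in pairs])
    order = np.argsort(-stats)
    return [pairs[k] for k in order], stats[order]


def omnibus_test(G, y, max_size=5, n_perm=200, seed=0):
    """Return (best ensemble of SNP pairs, global p-value). Hypothetical sketch."""
    rng = np.random.default_rng(seed)
    pairs, stats = ranked_pairs(G, y)
    max_size = min(max_size, len(pairs))
    # Ensemble statistic for sizes 1..max_size: sum of the top-k pair statistics
    # (an assumption of this sketch). Row 0 holds the observed data.
    rows = [np.cumsum(stats[:max_size])]
    # Step 2 (permutation test): shuffle case/control labels and re-run step 1.
    for _ in range(n_perm):
        _, perm_stats = ranked_pairs(G, rng.permutation(y))
        rows.append(np.cumsum(perm_stats[:max_size]))
    T = np.array(rows)                                   # shape (n_perm + 1, max_size)
    # p[r, k]: share of all rows whose size-(k+1) statistic is >= that of row r.
    p = (T[None, :, :] >= T[:, None, :]).mean(axis=1)
    # Step 3 (global p-value): compare the observed minimum p-value over ensemble
    # sizes with the minimum p-value obtained by each permuted data set.
    p_min = p.min(axis=1)
    global_p = float((p_min <= p_min[0]).mean())
    # Step 4 (progressive search): smallest ensemble reaching the minimum p-value.
    best_size = int(np.argmin(p[0])) + 1
    return pairs[:best_size], global_p


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    G = rng.binomial(2, 0.5, size=(300, 5))              # simulated genotypes
    # XOR-style model: SNPs 0 and 1 interact without marginal single-locus effects.
    y = ((G[:, 0] == 1) ^ (G[:, 1] == 1)).astype(int)
    print(omnibus_test(G, y, max_size=3, n_perm=50))
```

Taking the minimum p-value over ensemble sizes and then calibrating that minimum against the same permutations is what makes the test "omnibus" in this sketch: one global p-value covers the whole search, so no further multiple-testing correction across ensemble sizes is needed. Every permutation recomputes all SNP pairs, so the cost grows quadratically with the number of SNPs.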
url http://www.biomedcentral.com/1471-2105/10/294