Rule based classifier for the analysis of gene-gene and gene-environment interactions in genetic association studies

Abstract

Background: Several methods have been presented for the analysis of complex interactions between genetic polymorphisms and/or environmental factors. Despite the available methods, there is still a need for alternative methods, because no single...


Bibliographic Details
Main Authors: Lehr Thorsten, Yuan Jing, Zeumer Dirk, Jayadev Supriya, Ritchie Marylyn D
Format: Article
Language: English
Published: BMC 2011-03-01
Series: BioData Mining
Online Access: http://www.biodatamining.org/content/4/1/4
id doaj-aa72845de4e64dab8e3c6316d28c13ca
record_format Article
spelling doaj-aa72845de4e64dab8e3c6316d28c13ca 2020-11-24T21:19:56Z eng BMC BioData Mining 1756-0381 2011-03-01 4 1 4 10.1186/1756-0381-4-4 Rule based classifier for the analysis of gene-gene and gene-environment interactions in genetic association studies Lehr Thorsten Yuan Jing Zeumer Dirk Jayadev Supriya Ritchie Marylyn D http://www.biodatamining.org/content/4/1/4
collection DOAJ
language English
format Article
sources DOAJ
author Lehr Thorsten
Yuan Jing
Zeumer Dirk
Jayadev Supriya
Ritchie Marylyn D
title Rule based classifier for the analysis of gene-gene and gene-environment interactions in genetic association studies
publisher BMC
series BioData Mining
issn 1756-0381
publishDate 2011-03-01
description Abstract
Background: Several methods have been presented for the analysis of complex interactions between genetic polymorphisms and/or environmental factors. Despite the available methods, there is still a need for alternative methods, because no single method will perform well in all scenarios. The aim of this work was to evaluate the performance of three selected rule based classifier algorithms, RIPPER, RIDOR and PART, for the analysis of genetic association studies.
Methods: Overall, 42 datasets were simulated with three different case-control models and varying numbers of subjects (300, 600), numbers of SNPs (500, 1500, 3000) and noise levels (5%, 10%, 20%). The algorithms were applied to each of the datasets with a set of algorithm-specific settings. Results were further investigated at a) the Model level, b) the Rules level, and c) the Attribute level. Data analysis was performed using WEKA, SAS and PERL.
Results: The RIPPER algorithm discovered the true case-control model at least once in >33% of the datasets. The RIDOR and PART algorithms performed poorly for model detection. The RIPPER, RIDOR and PART algorithms discovered the true case-control rules in more than 83%, 83% and 44% of the datasets, respectively. All three algorithms were able to detect the attributes utilized in the respective case-control models in most datasets.
Conclusions: The current analyses substantiate the utility of rule based classifiers such as RIPPER, RIDOR and PART for the detection of gene-gene/gene-environment interactions in genetic association studies. These classifiers could provide a valuable new method, complementing existing approaches, in the analysis of genetic association studies. The methods have the advantage of handling both categorical and continuous variable types. Further, because the outputs of the analyses are easy to interpret, the rule based classifier approach could quickly generate testable hypotheses for additional evaluation. Since the algorithms are computationally inexpensive, they may serve as valuable tools for preselection of attributes to be used in more complex, computationally intensive approaches. Whether used in isolation or in conjunction with other tools, rule based classifiers are an important addition to the armamentarium of tools available for analyses of complex genetic association studies.
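Illustrative sketch (not taken from the article): the abstract describes simulating case-control datasets with a known model and label noise, then checking whether interpretable rules recover that model. The Python below is a minimal sketch of that idea under stated assumptions. The two-SNP interaction in `true_rule`, the uniform genotype sampling, and every function name are hypothetical choices made for illustration; the code only evaluates a fixed, hand-written rule list and does not implement RIPPER, RIDOR or PART, which the authors ran in WEKA.

```python
import random


def true_rule(row):
    # Hypothetical case-control model (an assumption, not the article's):
    # a subject is a case when SNP 0 and SNP 1 each carry at least one minor allele.
    return row[0] >= 1 and row[1] >= 1


def simulate_dataset(n_subjects=300, n_snps=500, noise=0.05, seed=0):
    """Simulate genotypes coded 0/1/2 and a noisy case-control label.

    Genotypes are drawn uniformly, which is a simplification; realistic
    simulations would use allele frequencies.
    """
    rng = random.Random(seed)
    genotypes = [[rng.choice((0, 1, 2)) for _ in range(n_snps)]
                 for _ in range(n_subjects)]
    labels = []
    for row in genotypes:
        is_case = true_rule(row)
        # Flip the label with probability `noise` to mimic misclassification.
        if rng.random() < noise:
            is_case = not is_case
        labels.append(is_case)
    return genotypes, labels


def rule_list_predict(row, rules, default=False):
    """Apply an ordered rule list: the first rule whose conditions all hold wins."""
    for conditions, outcome in rules:
        if all(row[index] >= minimum for index, minimum in conditions):
            return outcome
    return default


def accuracy(genotypes, labels, rules):
    """Fraction of subjects whose label matches the rule-list prediction."""
    hits = sum(rule_list_predict(row, rules) == label
               for row, label in zip(genotypes, labels))
    return hits / len(labels)


# One rule: IF snp0 >= 1 AND snp1 >= 1 THEN case; otherwise control (default).
RULES = [(((0, 1), (1, 1)), True)]

if __name__ == "__main__":
    for noise in (0.05, 0.10, 0.20):
        genotypes, labels = simulate_dataset(noise=noise)
        print(noise, accuracy(genotypes, labels, RULES))
```

Because the rule list here equals the simulated model, its accuracy is limited only by the injected label noise, which gives a rough picture of the ceiling a rule learner could reach on such data.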
url http://www.biodatamining.org/content/4/1/4
_version_ 1726004579426893824