adabag: An R Package for Classification with Boosting and Bagging

Boosting and bagging are two widely used ensemble methods for classification. Their common goal is to improve the accuracy of a classifier by combining single classifiers which are slightly better than random guessing. Among the family of boosting algorithms, AdaBoost (adaptive boosting) is the best known, although it is suitable only for dichotomous tasks. AdaBoost.M1 and SAMME (stagewise additive modeling using a multi-class exponential loss function) are two easy and natural extensions to the general case of two or more classes. In this paper, the adabag R package is introduced. This version implements the AdaBoost.M1, SAMME and bagging algorithms with classification trees as base classifiers. Once the ensembles have been trained, they can be used to predict the class of new samples. The accuracy of these classifiers can be estimated on a separate data set or through cross-validation. Moreover, the evolution of the error as the ensemble grows can be analysed and the ensemble can be pruned. In addition, the margin in the class prediction and the probability of each class for the observations can be calculated. Finally, several classic examples from the classification literature are shown to illustrate the use of this package.
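To make the AdaBoost.M1 procedure mentioned in the abstract concrete: in the two-class case it reduces to the original AdaBoost weight update, where misclassified observations are re-weighted by exp(alpha) after each round and the final prediction is a weighted vote of the base classifiers. The following is a minimal, standard-library-only Python sketch using decision stumps as base classifiers; it illustrates the algorithm only and is not the adabag API (all function names here are illustrative).

```python
import math

def stump_fit(X, y, w):
    # Exhaustively choose (feature, threshold, polarity) minimizing weighted error.
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            for pol in (1, -1):
                pred = [1 if pol * (row[j] - t) >= 0 else -1 for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, (j, t, pol))
    return best[1], best[0]

def stump_predict(stump, x):
    j, t, pol = stump
    return 1 if pol * (x[j] - t) >= 0 else -1

def adaboost_m1(X, y, M=10):
    # Labels y are in {-1, +1}; X is a list of feature vectors.
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(M):
        stump, err = stump_fit(X, y, w)
        if err >= 0.5:           # base learner no better than chance: stop
            break
        err = max(err, 1e-10)    # guard against division by zero on a perfect stump
        alpha = math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Up-weight misclassified observations, then renormalize.
        w = [wi * math.exp(alpha) if stump_predict(stump, xi) != yi else wi
             for wi, xi, yi in zip(w, X, y)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble

def predict(ensemble, x):
    # Weighted vote of the base classifiers.
    score = sum(a * stump_predict(s, x) for a, s in ensemble)
    return 1 if score >= 0 else -1
```

The adabag package itself plays the role of `adaboost_m1` here but uses classification trees (not stumps) as base classifiers and generalizes the vote to two or more classes.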

Bibliographic Details
Main Authors: Esteban Alfaro, Matias Gamez, Noelia García
Format: Article
Language: English
Published: Foundation for Open Access Statistics, 2013-09-01
Series: Journal of Statistical Software
ISSN: 1548-7660
DOI: 10.18637/jss.v054.i02
Online Access: http://www.jstatsoft.org/index.php/jss/article/view/2082