Variable Selection in Boosting

Bibliographic Details
Main Authors: Yi-mo Tasi, 蔡易牟
Other Authors: Chen-Hai Andy Tsao
Format: Others
Language: zh-TW
Published: 2004
Online Access: http://ndltd.ncl.edu.tw/handle/47748853554906680421
Description
Summary: Master's thesis === National Dong Hwa University === Department of Applied Mathematics === 92 === Boosting is a successful ensemble classification method. It has attracted much attention recently because of its impressive empirical performance and its less well understood theoretical properties. In this study, we consider the issue of variable selection under the boosting scheme. We study the effectiveness of principal component analysis (PCA), the significance-based selection method (SIG), and their hybrids as methods of selection. Under a multivariate normal model, we found that PCA-SIG (PCA followed by significance-based selection) methods outperform PCA methods and are comparable with SIG-PCA methods. Similar yet more marked phenomena are observed in an analysis of NBA 2002-2003 (Spurs vs. Nets) box-score data. Using the box scores from 'Spurs-like' and 'Nets-like' teams, we treat winner prediction as a (supervised) learning problem. We note that boosting with PCA-SIG achieves satisfactory error rates and outperforms the other methods.
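
The PCA-SIG idea in the summary can be illustrated with a minimal sketch. It is not the thesis's implementation: the record does not state the significance criterion, the weak learner, or any tuning values, so the Welch |t| cutoff (t_min), the decision stumps, the number of boosting rounds, and the synthetic multivariate normal data below are all assumptions chosen for illustration. Every function name is hypothetical.

```python
import numpy as np

def pca_scores(X, k):
    """Project centered X onto its top-k principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def significant_columns(Z, y, t_min=2.0):
    """Assumed SIG step: keep columns whose Welch two-sample |t| exceeds t_min."""
    a, b = Z[y == 1], Z[y == -1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    t = (a.mean(axis=0) - b.mean(axis=0)) / se
    return np.flatnonzero(np.abs(t) > t_min)

def fit_stump(Z, y, w):
    """Search all (feature, threshold, sign) stumps for the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(Z.shape[1]):
        for thr in np.unique(Z[:, j]):
            pred = np.where(Z[:, j] > thr, 1, -1)
            err = w[pred != y].sum()
            # Flipping the sign of a stump turns error err into 1 - err.
            for sign, e in ((1, err), (-1, 1.0 - err)):
                if e < best[0]:
                    best = (e, j, thr, sign)
    return best

def adaboost(Z, y, rounds=20):
    """Discrete AdaBoost with decision stumps; labels must be +1 / -1."""
    w = np.full(len(y), 1.0 / len(y))
    model = []
    for _ in range(rounds):
        err, j, thr, sign = fit_stump(Z, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)       # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = sign * np.where(Z[:, j] > thr, 1, -1)
        w *= np.exp(-alpha * y * pred)             # upweight misclassified points
        w /= w.sum()
        model.append((alpha, j, thr, sign))
    return model

def predict(model, Z):
    score = sum(a * s * np.where(Z[:, j] > t, 1, -1) for a, j, t, s in model)
    return np.where(score >= 0, 1, -1)

# Synthetic two-class normal data (an assumption, not the thesis's data).
rng = np.random.default_rng(0)
n, p = 200, 10
y = np.where(rng.random(n) < 0.5, 1, -1)
X = rng.normal(size=(n, p))
X[:, :3] += 1.0 * y[:, None]           # only the first three variables carry signal

Z = pca_scores(X, k=5)                 # PCA step
keep = significant_columns(Z, y)       # SIG step applied to the PCA scores
model = adaboost(Z[:, keep], y)        # boosting on the selected components
train_error = np.mean(predict(model, Z[:, keep]) != y)
print("kept components:", keep, "training error:", train_error)
```

Swapping the order of the two selection steps (filtering the original variables first, then applying PCA to the survivors) would give a SIG-PCA variant under the same assumptions. For an honest error estimate, the centering, principal directions, and selected columns should be computed on training data only and then reused on held-out data.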