Summary: | Random forest (RF) is an ensemble classification method in which every decision tree participates in voting, so low-quality trees can reduce the accuracy of the forest. To improve accuracy, only decision trees with greater diversity and higher classification accuracy are selected for voting. This paper proposes an RF pruning method based on the Kappa measure and an improved binary artificial bee colony algorithm (IBABC). First, the Kappa measure is used for pre-pruning, which selects the decision trees with greater diversity from the forest. Then, a crossover operator and a leaping operator are incorporated into the artificial bee colony (ABC) algorithm, and the resulting improved binary ABC performs a secondary pruning that selects the better-performing decision trees for voting. The proposed method (Kappa+IBABC) is tested on a number of UCI datasets. Computational results show that Kappa+IBABC improves performance on most datasets while using fewer decision trees. The Wilcoxon signed-rank test is used to assess whether the differences between Kappa+IBABC and other pruning methods are statistically significant. In addition, because haze pollution in China is becoming increasingly serious, the proposed method is applied to haze weather prediction, where it achieves good results.
|
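The Kappa-based pre-pruning step can be illustrated with a minimal sketch. The summary does not state how the Kappa measure is turned into a selection rule, so the code below rests on an assumption: each tree is scored by its mean pairwise Cohen's kappa with every other tree on a validation set, and the trees with the lowest scores (the least agreement, hence the most diversity) are kept. The function names `cohen_kappa` and `kappa_prepruning`, the `keep` parameter, and the toy predictions are all hypothetical and are not taken from the paper. The IBABC secondary pruning is not shown.

```python
from itertools import combinations


def cohen_kappa(pred_a, pred_b):
    """Cohen's kappa between the label predictions of two classifiers."""
    n = len(pred_a)
    labels = set(pred_a) | set(pred_b)
    # Fraction of samples on which the two classifiers agree.
    observed = sum(a == b for a, b in zip(pred_a, pred_b)) / n
    # Agreement expected by chance, from each classifier's label frequencies.
    expected = sum((pred_a.count(c) / n) * (pred_b.count(c) / n) for c in labels)
    if expected == 1.0:
        return 1.0  # both classifiers always predict the same single label
    return (observed - expected) / (1.0 - expected)


def kappa_prepruning(tree_predictions, keep):
    """Return indices of the `keep` trees with the lowest mean pairwise kappa.

    Assumed rule (not confirmed by the source): lower kappa means less
    agreement with the rest of the forest, i.e. greater diversity.
    """
    m = len(tree_predictions)
    totals = [0.0] * m
    for i, j in combinations(range(m), 2):
        k = cohen_kappa(tree_predictions[i], tree_predictions[j])
        totals[i] += k
        totals[j] += k
    mean_kappa = [t / (m - 1) for t in totals]
    ranked = sorted(range(m), key=lambda i: mean_kappa[i])
    return sorted(ranked[:keep])


# Toy validation-set predictions from four hypothetical trees.
preds = [
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 1, 0, 1],  # identical to tree 0, so it adds no diversity
    [0, 1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1, 0],
]
selected = kappa_prepruning(preds, keep=2)
print(selected)  # [2, 3]
```

Ranking by diversity alone ignores accuracy: in this toy example, trees 2 and 3 disagree on every sample. That is consistent with the summary's two-stage design, in which a second pruning step then selects the better-performing trees.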