Summary: | Master's === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 99 === Research on constructing hierarchical architectures for document classification usually focuses on adjusting the hierarchy and commonly relies on global information for preprocessing. Most approaches build the new architecture without considering how local sites change during the adjustments, which restrains the effect of local information in preprocessing. To address this problem, we propose two methods that modify the traditional approaches: TOP Local Feature for feature extraction and Level Term Weighting for term weighting. By using the within-class information of each individual category, both methods emphasize the value of local information and provide an adjustment better suited to the current state of the structure. Finally, we use the TOP Local Feature information as a small snapshot of each node to help adjust the classification architecture from a flat model into a hierarchical one. Classifiers trained on the new hierarchical architecture achieve better performance.
Empirical evaluations on real-world data sets show that using local information improves the macro-averaged F-Measure. The new hierarchical architecture based on our proposed method improves the best F-Measure performance by about 14%.
|
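The macro-averaged F-Measure reported in the summary can be illustrated with a minimal sketch. It assumes the F-Measure is the balanced F1 score (the summary does not state the beta value), and the function name `macro_f1` and the toy labels are hypothetical, not taken from the thesis. Macro-averaging computes F1 separately for each category and then takes the unweighted mean, so small categories count as much as large ones.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Macro-averaged F1: unweighted mean of per-class F1 scores.

    Assumption: F-Measure means F1 (beta = 1); the summary does not say.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # correct prediction for class t
        else:
            fp[p] += 1      # class p was predicted wrongly
            fn[t] += 1      # class t was missed
    scores = []
    for c in labels:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        precision = tp[c] / predicted if predicted else 0.0
        recall = tp[c] / actual if actual else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy labels for two categories.
y_true = ["sports", "sports", "politics", "politics"]
y_pred = ["sports", "politics", "politics", "politics"]
print(round(macro_f1(y_true, y_pred), 4))
```

In this toy case the per-class F1 scores are 2/3 for `sports` and 4/5 for `politics`, so the macro average is 11/15.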