Multi-label Text Categorization Using a Chi-Square Based Method

Bibliographic Details
Main Authors: Ching-Ting Lu, 呂靜婷
Other Authors: 陸承志
Format: Others
Language: zh-TW
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/34451785132367205602
Description
Summary: Master's thesis === Yuan Ze University (元智大學) === Department of Information Management (資訊管理學系) === 96 === This study presents a method for multi-label text categorization based on a term-category weighted matrix. The method uses an inverse chi-square classifier to calculate an indicator value for each category under consideration, based on the test document's feature weights, which are represented by correlation coefficients. We apply three thresholds, DF (Document Frequency), CC (Correlation Coefficient), and ICF (Inverted Conformity Frequency), to extract the relevant terms for each category. Finally, we conduct experiments on the top 10 categories of Reuters-21578. The experimental results show that precision, recall, and F1-measure reach 87%, 98%, and 92%, respectively. The method is shown to be comparable to the well-known multi-label method BoosTexter.
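
As a quick consistency check on the reported figures, the standard F1-measure is the harmonic mean of precision and recall. The sketch below assumes the thesis uses this standard definition (the record does not say how the scores were averaged across categories); the function name `f1_score` and the variable names are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the summary (assumed to be overall scores).
reported_precision = 0.87
reported_recall = 0.98

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.4f}")  # prints "F1 = 0.9217"
```

Under that assumption, the result rounds to 92%, which agrees with the F1-measure stated in the summary.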