Multi-label Text Categorization Using a Chi-Square Based Method

Master's thesis === Yuan Ze University === Department of Information Management === 96 === This study presents a method for multi-label text categorization based on a term-category weighted matrix. The method uses an inverse chi-square classifier to calculate an indicator value for each category under consideration, based on the test document’s feat...

Bibliographic Details
Main Authors: Ching-Ting Lu, 呂靜婷
Other Authors: 陸承志
Format: Others
Language: zh-TW
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/34451785132367205602
id ndltd-TW-096YZU05396055
record_format oai_dc
spelling ndltd-TW-096YZU053960552015-10-13T13:48:21Z http://ndltd.ncl.edu.tw/handle/34451785132367205602 Multi-label Text Categorization Using a Chi-Square Based Method 一個以卡方為基礎的文件多重分類方法 Ching-Ting Lu 呂靜婷 Master's thesis Yuan Ze University Department of Information Management 96 This study presents a method for multi-label text categorization based on a term-category weighted matrix. The method uses an inverse chi-square classifier to calculate an indicator value for each category under consideration, based on the test document’s feature weights, which are represented by correlation coefficients. We use three thresholds, DF (Document Frequency), CC (Correlation Coefficient), and ICF (Inverted Conformity Frequency), to extract the relevant terms for each category. Finally, we conduct experiments on the top 10 categories of Reuters-21578. The experimental results show that precision, recall, and F1-measure can reach 87%, 98%, and 92%, respectively. Our method is shown to be comparable to the well-known multi-label method BoosTexter. 陸承志 2008 Degree thesis ; thesis 53 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis === Yuan Ze University === Department of Information Management === 96 === This study presents a method for multi-label text categorization based on a term-category weighted matrix. The method uses an inverse chi-square classifier to calculate an indicator value for each category under consideration, based on the test document’s feature weights, which are represented by correlation coefficients. We use three thresholds, DF (Document Frequency), CC (Correlation Coefficient), and ICF (Inverted Conformity Frequency), to extract the relevant terms for each category. Finally, we conduct experiments on the top 10 categories of Reuters-21578. The experimental results show that precision, recall, and F1-measure can reach 87%, 98%, and 92%, respectively. Our method is shown to be comparable to the well-known multi-label method BoosTexter.
author2 陸承志
author_facet 陸承志
Ching-Ting Lu
呂靜婷
author Ching-Ting Lu
呂靜婷
author_sort Ching-Ting Lu
title Multi-label Text Categorization Using a Chi-Square Based Method
publishDate 2008
url http://ndltd.ncl.edu.tw/handle/34451785132367205602
_version_ 1717744077620903936
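
As a quick consistency check on the figures quoted in the abstract, the sketch below computes the F1-measure as the harmonic mean of precision and recall. It assumes the three reported values (87%, 98%, 92%) come from the same experimental configuration, which the record does not state; the function and variable names are illustrative and are not taken from the thesis.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the F1-measure, the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both values are zero
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract (assumed to belong to the same run).
reported_precision = 0.87
reported_recall = 0.98

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.4f}")
```

Under that assumption, the computed value rounds to the 92% given in the abstract, so the three reported figures are mutually consistent.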