Multi-label Text Categorization Using a Chi-Square Based Method
Master's === Yuan Ze University === Department of Information Management === 96 === This study presents a chi-square based method for multi-label text categorization using a term-category weighted matrix. The method uses an inverse chi-square classifier to calculate an indicator value for each category under consideration, based on the testing document’s feat...
Main Authors: | Ching-Ting Lu, 呂靜婷 |
---|---|
Other Authors: | 陸承志 |
Format: | Others |
Language: | zh-TW |
Published: | 2008 |
Online Access: | http://ndltd.ncl.edu.tw/handle/34451785132367205602 |
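The abstract names a chi-square based term-category weighting with correlation coefficients as feature weights, but the record does not give the formulas. As a rough illustration only, the sketch below uses the standard textbook 2x2 contingency-table definitions of the chi-square statistic and the signed correlation coefficient for one term and one category. The function names, the argument names `a`, `b`, `c`, `d`, and the example counts are assumptions made for this example and are not taken from the thesis.

```python
import math


def chi_square(a, b, c, d):
    """Textbook chi-square statistic for one (term, category) pair.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    (These names are illustrative assumptions, not the thesis's notation.)
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table: the term or the category never varies.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def correlation_coefficient(a, b, c, d):
    """Signed square root of chi-square: positive when the term
    co-occurs with the category more often than chance would suggest."""
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return math.sqrt(n) * (a * d - c * b) / math.sqrt(denom)


if __name__ == "__main__":
    # Hypothetical counts for a term that favors the category.
    print(chi_square(30, 10, 20, 40))
    print(correlation_coefficient(30, 10, 20, 40))
    # The same term measured against a category it tends to avoid.
    print(correlation_coefficient(10, 30, 40, 20))
```

Under these definitions the squared correlation coefficient equals the chi-square value, and the sign separates terms that indicate a category from terms that indicate its absence. How the thesis combines such weights in its inverse chi-square classifier is not stated in this record.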
id |
ndltd-TW-096YZU05396055 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-096YZU05396055 2015-10-13T13:48:21Z http://ndltd.ncl.edu.tw/handle/34451785132367205602 Multi-label Text Categorization Using a Chi-Square Based Method 一個以卡方為基礎的文件多重分類方法 Ching-Ting Lu 呂靜婷 Master's Yuan Ze University Department of Information Management 96 This study presents a chi-square based method for multi-label text categorization using a term-category weighted matrix. The method uses an inverse chi-square classifier to calculate an indicator value for each category under consideration, based on the testing document’s feature weights, which are represented by correlation coefficients. We use three thresholds, DF (Document Frequency), CC (Correlation Coefficient) and ICF (Inverted Conformity Frequency), to extract each category’s relevant terms. Finally, we conduct experiments on the top 10 categories of Reuters-21578. The experimental results show that precision, recall, and F1-measure can reach 87%, 98%, and 92%, respectively. Our method is shown to be comparable to the well-known multi-label method BoosTexter. 陸承志 2008 degree thesis ; thesis 53 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === Yuan Ze University === Department of Information Management === 96 === This study presents a chi-square based method for multi-label text categorization using a term-category weighted matrix. The method uses an inverse chi-square classifier to calculate an indicator value for each category under consideration, based on the testing document’s feature weights, which are represented by correlation coefficients.
We use three thresholds, DF (Document Frequency), CC (Correlation Coefficient) and ICF (Inverted Conformity Frequency), to extract each category’s relevant terms. Finally, we conduct experiments on the top 10 categories of Reuters-21578. The experimental results show that precision, recall, and F1-measure can reach 87%, 98%, and 92%, respectively. Our method is shown to be comparable to the well-known multi-label method BoosTexter.
|
author2 |
陸承志 |
author_facet |
陸承志 Ching-Ting Lu 呂靜婷 |
author |
Ching-Ting Lu 呂靜婷 |
spellingShingle |
Ching-Ting Lu 呂靜婷 Multi-label Text Categorization Using a Chi-Square Based Method |
author_sort |
Ching-Ting Lu |
title |
Multi-label Text Categorization Using a Chi-Square Based Method |
publishDate |
2008 |
url |
http://ndltd.ncl.edu.tw/handle/34451785132367205602 |
work_keys_str_mv |
AT chingtinglu multilabeltextcategorizationusingachisquarebasedmethod AT lǚjìngtíng multilabeltextcategorizationusingachisquarebasedmethod AT chingtinglu yīgèyǐkǎfāngwèijīchǔdewénjiànduōzhòngfēnlèifāngfǎ AT lǚjìngtíng yīgèyǐkǎfāngwèijīchǔdewénjiànduōzhòngfēnlèifāngfǎ |
_version_ |
1717744077620903936 |
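The description reports precision, recall, and F1-measure of 87%, 98%, and 92%. As a quick consistency check, the sketch below computes F1 as the harmonic mean of precision and recall, which is the standard definition. It assumes the reported figures are a single precision/recall pair; if the thesis averaged per-category scores, the reported F1 would not have to match this calculation exactly. The function name `f1_score` is chosen for this example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Figures quoted in the record's description field.
    f1 = f1_score(0.87, 0.98)
    print(f"F1 = {f1:.4f}")  # rounds to 92%, matching the reported value
```

The three reported numbers are therefore mutually consistent under the standard definition.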