Weight of Features in Automatic Data Classification

Master's === Chung Yuan Christian University (中原大學) === Graduate Institute of Information Engineering (資訊工程研究所) === 101 === Users have difficulty finding the documents they need when documents are not properly classified. Feature selection and screening is an important step in automatic document classification. The documents in a Knowledge Management System are stemmed and then the...


Bibliographic Details
Main Authors: CHANG-MING CHEN, 陳昶旻
Other Authors: Chung-Shyan Liu
Format: Others
Language: zh-TW
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/77191503658629811230
id ndltd-TW-101CYCU5392041
record_format oai_dc
spelling ndltd-TW-101CYCU5392041 2015-10-13T22:40:30Z http://ndltd.ncl.edu.tw/handle/77191503658629811230 Weight of Features in Automatic Data Classification 基於特徵權重做文本分類 (Text Classification Based on Feature Weights) CHANG-MING CHEN 陳昶旻 Master's Chung Yuan Christian University (中原大學) Graduate Institute of Information Engineering (資訊工程研究所) 101 Users have difficulty finding the documents they need when documents are not properly classified. Feature selection and screening is an important step in automatic document classification. The documents in a Knowledge Management System are stemmed, and the weight of each term associated with a document is then calculated using TF-IDF. WordNet was used to screen the relevant keywords from example recipe files to compose the feature vectors. Cross-validation was used to train the model. Unclassified documents are then classified by the k-nearest-neighbor method using the trained model. After classification, the documents are moved to the corresponding folder in the KMS using the API of Vitas/KM. The accuracy was compared with that obtained without feature selection and screening. Chung-Shyan Liu 留忠賢 2013 學位論文 ; thesis 40 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === Chung Yuan Christian University (中原大學) === Graduate Institute of Information Engineering (資訊工程研究所) === 101 === Users have difficulty finding the documents they need when documents are not properly classified. Feature selection and screening is an important step in automatic document classification. The documents in a Knowledge Management System are stemmed, and the weight of each term associated with a document is then calculated using TF-IDF. WordNet was used to screen the relevant keywords from example recipe files to compose the feature vectors. Cross-validation was used to train the model. Unclassified documents are then classified by the k-nearest-neighbor method using the trained model. After classification, the documents are moved to the corresponding folder in the KMS using the API of Vitas/KM. The accuracy was compared with that obtained without feature selection and screening.
author2 Chung-Shyan Liu
author_facet Chung-Shyan Liu
CHANG-MING CHEN
陳昶旻
author CHANG-MING CHEN
陳昶旻
spellingShingle CHANG-MING CHEN
陳昶旻
Weight of Features in Automatic Data Classification
author_sort CHANG-MING CHEN
title Weight of Features in Automatic Data Classification
title_short Weight of Features in Automatic Data Classification
title_full Weight of Features in Automatic Data Classification
title_fullStr Weight of Features in Automatic Data Classification
title_full_unstemmed Weight of Features in Automatic Data Classification
title_sort weight of features in automatic data classification
publishDate 2013
url http://ndltd.ncl.edu.tw/handle/77191503658629811230
work_keys_str_mv AT changmingchen weightoffeaturesinautomaticdataclassification
AT chénchǎngmín weightoffeaturesinautomaticdataclassification
AT changmingchen jīyútèzhēngquánzhòngzuòwénběnfēnlèi
AT chénchǎngmín jīyútèzhēngquánzhòngzuòwénběnfēnlèi
_version_ 1718079184215998464
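
Illustrative note: the abstract describes weighting terms with TF-IDF and classifying unlabeled documents with k-nearest-neighbor. The following is a minimal Python sketch of those two steps only, not the thesis implementation. Stemming, WordNet-based keyword screening, cross-validation, and the Vitas/KM API call are omitted; the smoothed IDF formula, the cosine similarity measure, the function names, and the toy documents and labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for the stemmed terms.
    return text.lower().split()


def fit_idf(docs):
    # Smoothed inverse document frequency (one common variant, assumed here).
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(doc, idf):
    # Term frequency times IDF, L2-normalized; unseen terms are dropped.
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine similarity.
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def knn_classify(doc, train_vecs, labels, idf, k=3):
    # Majority vote among the k most similar training documents.
    query = tfidf(doc, idf)
    ranked = sorted(zip(train_vecs, labels),
                    key=lambda pair: cosine(query, pair[0]),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training data, invented for this sketch.
train_docs = [
    "bake the cake with flour sugar and butter",
    "whisk eggs and sugar then bake",
    "simmer the soup with onion and carrot",
    "boil stock and simmer soup with noodles",
]
train_labels = ["dessert", "dessert", "soup", "soup"]

idf = fit_idf(train_docs)
train_vecs = [tfidf(d, idf) for d in train_docs]
predicted = knn_classify("bake a cake with butter and eggs",
                         train_vecs, train_labels, idf, k=3)
print(predicted)
```

In a setup like the one the abstract describes, the predicted label would determine the target folder for the document; how that move is performed through Vitas/KM is not specified in this record.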