Weight of Features in Automatic Data Classification
Master's === Chung Yuan Christian University === Graduate Institute of Information and Computer Engineering === 101 === Users have difficulty finding the documents they need when documents are not properly classified. Feature selection and screening is an important step in automatic document classification. The documents in a Knowledge Management System are stemmed and then the...
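The full description in this record states that, after stemming, the weight of each term in a document is calculated using TF-IDF. The record does not give the exact formula used in the thesis, so the following is only a minimal sketch of one common variant (term frequency times log of inverse document frequency). The function name `tf_idf` and the sample tokens are illustrative assumptions, and the input is assumed to be already tokenized and stemmed.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized (already stemmed) document.

    Assumed variant: weight = (count / doc length) * log(N / document frequency).
    The thesis may use a different normalization.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical sample data, not taken from the thesis.
docs = [
    ["invoice", "payment", "invoice"],
    ["payment", "schedule"],
    ["meeting", "schedule", "agenda"],
]
weights = tf_idf(docs)
```

In this variant a term that occurs in every document receives weight zero, which is why frequent but uninformative terms contribute little to the feature vectors.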
Main Authors: | CHANG-MING CHEN, 陳昶旻 |
---|---|
Other Authors: | Chung-Shyan Liu |
Format: | Others |
Language: | zh-TW |
Published: | 2013 |
Online Access: | http://ndltd.ncl.edu.tw/handle/77191503658629811230 |
id |
ndltd-TW-101CYCU5392041 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-101CYCU5392041 2015-10-13T22:40:30Z http://ndltd.ncl.edu.tw/handle/77191503658629811230 Weight of Features in Automatic Data Classification 基於特徵權重做文本分類 CHANG-MING CHEN 陳昶旻 Master's Chung Yuan Christian University Graduate Institute of Information and Computer Engineering 101 Users have difficulty finding the documents they need when documents are not properly classified. Feature selection and screening is an important step in automatic document classification. The documents in a Knowledge Management System (KMS) are stemmed, and the weight of each term associated with a document is then calculated using TF-IDF. WordNet was used to screen the relevant keywords from example recipe files to compose the feature vectors. Cross-validation was used to train the model. The unclassified documents are then classified with the k-nearest-neighbor method using the trained model. After classification, the documents are moved to the corresponding folder in the KMS using the API of Vitas/KM. The accuracy was compared with that obtained without feature selection and screening. Chung-Shyan Liu 留忠賢 2013 thesis 40 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === Chung Yuan Christian University === Graduate Institute of Information and Computer Engineering === 101 === Users have difficulty finding the documents they need when documents are not properly classified. Feature selection and screening is an important step in automatic document classification. The documents in a Knowledge Management System (KMS) are stemmed, and the weight of each term associated with a document is then calculated using TF-IDF. WordNet was used to screen the relevant keywords from example recipe files to compose the feature vectors. Cross-validation was used to train the model. The unclassified documents are then classified with the k-nearest-neighbor method using the trained model. After classification, the documents are moved to the corresponding folder in the KMS using the API of Vitas/KM. The accuracy was compared with that obtained without feature selection and screening.
|
author2 |
Chung-Shyan Liu |
author_facet |
Chung-Shyan Liu CHANG-MING CHEN 陳昶旻 |
author |
CHANG-MING CHEN 陳昶旻 |
spellingShingle |
CHANG-MING CHEN 陳昶旻 Weight of Features in Automatic Data Classification |
author_sort |
CHANG-MING CHEN |
title |
Weight of Features in Automatic Data Classification |
publishDate |
2013 |
url |
http://ndltd.ncl.edu.tw/handle/77191503658629811230 |
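The description in this record says that unclassified documents are classified with the k-nearest-neighbor method and that cross-validation is used with the training data. The record does not state the similarity measure, the value of k, or the number of folds, so the sketch below is only an illustration under assumptions: cosine similarity over sparse term-weight vectors, majority voting, and interleaved folds. The names `cosine`, `knn_predict`, and `cross_validate`, the category labels, and the sample vectors are hypothetical.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_predict(train, query, k=3):
    """Majority label among the k training vectors most similar to `query`.

    `train` is a list of (vector, label) pairs.
    """
    ranked = sorted(train, key=lambda item: cosine(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def cross_validate(data, folds=3, k=3):
    """Fraction of documents classified correctly when each fold is held out once."""
    correct = 0
    for f in range(folds):
        held_out = data[f::folds]  # every `folds`-th item, starting at index f
        train = [item for i, item in enumerate(data) if i % folds != f]
        correct += sum(knn_predict(train, vec, k) == label for vec, label in held_out)
    return correct / len(data)

# Hypothetical feature vectors and folder labels, not taken from the thesis.
data = [
    ({"invoice": 0.9, "payment": 0.4}, "finance"),
    ({"meeting": 0.8, "agenda": 0.5}, "admin"),
    ({"payment": 0.7, "invoice": 0.3}, "finance"),
    ({"agenda": 0.9, "schedule": 0.2}, "admin"),
    ({"invoice": 0.6, "budget": 0.5}, "finance"),
    ({"meeting": 0.6, "schedule": 0.6}, "admin"),
]
accuracy = cross_validate(data, folds=3, k=1)
predicted_folder = knn_predict(data, {"invoice": 1.0}, k=3)
```

Running `cross_validate` once on vectors built from all terms and once on vectors restricted to the screened keywords would give the kind of accuracy comparison the description mentions. Moving a document to its folder through the Vitas/KM API is not shown, because the record gives no details of that API.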