A Study of Using Data Mining Techniques for Text Classification

Master's thesis === Southern Taiwan University of Technology (南台科技大學) === Department of Information Management === 93 === Information changes rapidly in today's society, and staying competitive requires presenting, applying, and receiving data in an effective, organized, and accurate way. This study aims to improve the quality of text classification via inform...

Bibliographic Details
Main Authors: Jen wen yan, 晏文珍
Other Authors: Chui Cheng Chen
Format: Others
Language: zh-TW
Published: 2005
Online Access: http://ndltd.ncl.edu.tw/handle/48144129555740914501
id ndltd-TW-093STUT0396012
record_format oai_dc
spelling ndltd-TW-093STUT0396012 2016-11-22T04:12:22Z http://ndltd.ncl.edu.tw/handle/48144129555740914501 A Study of Using Data Mining Techniques for Text Classification 利用資料探勘技術於文件分類之研究 Jen wen yan 晏文珍 Master's thesis, Southern Taiwan University of Technology (南台科技大學), Department of Information Management, 93 Chui Cheng Chen 陳垂呈 2005 學位論文 (degree thesis) ; thesis 70 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis, Southern Taiwan University of Technology (南台科技大學), Department of Information Management, academic year 93. Information changes rapidly in today's society, and staying competitive requires presenting, applying, and receiving data in an effective, organized, and accurate way. This study aims to improve the quality of text classification through informational and technical support. Data mining, currently a very popular approach, extracts useful knowledge and information from large amounts of data, and it has been widely applied in journalism and search engines. Using classification, one of the data mining techniques, this thesis seeks the most suitable classification rules for news data, evaluates them on the Reuters-21578 collection, addresses the drawbacks of traditional classification approaches, and improves text classification quality. The content of Reuters-21578 is the source for analysis. News title data are classified according to the primary category items. During the analysis, a large number of keywords are collected to determine which kinds of words the articles are composed of. Next, an ID3 decision tree is built with classification analysis to determine the categories to which keyword features belong. The research also analyzes term frequency and location weighting to improve the accuracy of the classification results. Precision and recall are the primary evaluation criteria used to verify effectiveness. The results may serve as a reference for automating text classification.
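
The abstract names two standard building blocks: the ID3 split criterion (information gain over keyword features) and precision/recall evaluation. The following is a minimal Python sketch of both, not the thesis's actual implementation. The function names, the toy documents, and the category labels are all invented for illustration, and the frequency and location weighting mentioned in the abstract are not modeled here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(docs, labels, keyword):
    """Entropy reduction from splitting on whether `keyword` occurs in a document."""
    present = [lab for doc, lab in zip(docs, labels) if keyword in doc]
    absent = [lab for doc, lab in zip(docs, labels) if keyword not in doc]
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in (present, absent) if part
    )
    return entropy(labels) - remainder


def precision_recall(predicted, actual, category):
    """Precision and recall for one category, from parallel label lists."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == category and a == category for p, a in pairs)
    fp = sum(p == category and a != category for p, a in pairs)
    fn = sum(p != category and a == category for p, a in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy, made-up data: each document is reduced to a set of keywords.
docs = [{"wheat", "export"}, {"wheat", "crop"}, {"oil", "barrel"}, {"oil", "export"}]
labels = ["grain", "grain", "crude", "crude"]

# ID3 chooses the keyword with the highest information gain as the next test.
root = max(["wheat", "export"], key=lambda k: information_gain(docs, labels, k))

# Hypothetical classifier output, scored against the true labels.
predicted = ["grain", "grain", "crude", "grain"]
precision, recall = precision_recall(predicted, labels, "grain")
```

In this toy set, "wheat" separates the two categories perfectly while "export" appears once in each, so ID3 would test "wheat" first. A full ID3 tree repeats the same selection recursively on each resulting subset.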
author2 Chui Cheng Chen
author_facet Chui Cheng Chen
Jen wen yan
晏文珍
author Jen wen yan
晏文珍
spellingShingle Jen wen yan
晏文珍
A Study of Using Data Mining Techniques for Text Classification
author_sort Jen wen yan
title A Study of Using Data Mining Techniques for Text Classification
publishDate 2005
url http://ndltd.ncl.edu.tw/handle/48144129555740914501