A Study of Using Data Mining Techniques for Text Classification

Master's === 南台科技大學 (Southern Taiwan University of Technology) === Department of Information Management === 93 === Information changes rapidly in today's society, and staying competitive requires presenting, applying, and receiving data in an effective, organized, and accurate way. This study aims to improve the quality of text classification via inform...

Bibliographic Details
Main Authors:	Jen wen yan, 晏文珍
Other Authors:	Chui Cheng Chen
Format:	Others
Language:	zh-TW
Published:	2005
Online Access:	http://ndltd.ncl.edu.tw/handle/48144129555740914501

id	ndltd-TW-093STUT0396012
record_format	oai_dc
spelling	ndltd-TW-093STUT0396012 2016-11-22T04:12:22Z http://ndltd.ncl.edu.tw/handle/48144129555740914501 A Study of Using Data Mining Techniques for Text Classification 利用資料探勘技術於文件分類之研究 Jen wen yan 晏文珍 Master's 南台科技大學 (Southern Taiwan University of Technology) Department of Information Management 93 Information changes rapidly in today's society, and staying competitive requires presenting, applying, and receiving data in an effective, organized, and accurate way. This study aims to improve the quality of text classification through informational and technical support. Data mining, currently the most popular application approach, extracts useful knowledge and information from large amounts of data, and it is now widely applied in journalism and search engines. This thesis uses classification, one of the data mining techniques, to find the most suitable classification rules for news data. It evaluates them on the Reuters-21578 collection, addresses the drawbacks of traditional classification approaches, and seeks to improve text classification quality. The content of Reuters-21578 is the source for analysis. News title data are classified according to the primary classification items. During the analysis, a large number of keywords are collected to determine which kinds of words the articles are composed of. Next, an ID3 decision tree is built with classification analysis to identify the categories to which keyword characteristics belong. The research also analyzes keyword frequency and weighted location to improve the accuracy of the classification results. Precision and recall are used as the primary evaluation criteria to verify effectiveness. The results may serve as a reference for automating text classification. Chui Cheng Chen 陳垂呈 2005 學位論文 ; thesis 70 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === 南台科技大學 (Southern Taiwan University of Technology) === Department of Information Management === 93 === Information changes rapidly in today's society, and staying competitive requires presenting, applying, and receiving data in an effective, organized, and accurate way. This study aims to improve the quality of text classification through informational and technical support. Data mining, currently the most popular application approach, extracts useful knowledge and information from large amounts of data, and it is now widely applied in journalism and search engines. This thesis uses classification, one of the data mining techniques, to find the most suitable classification rules for news data. It evaluates them on the Reuters-21578 collection, addresses the drawbacks of traditional classification approaches, and seeks to improve text classification quality. The content of Reuters-21578 is the source for analysis. News title data are classified according to the primary classification items. During the analysis, a large number of keywords are collected to determine which kinds of words the articles are composed of. Next, an ID3 decision tree is built with classification analysis to identify the categories to which keyword characteristics belong. The research also analyzes keyword frequency and weighted location to improve the accuracy of the classification results. Precision and recall are used as the primary evaluation criteria to verify effectiveness. The results may serve as a reference for automating text classification.
author2	Chui Cheng Chen
author_facet	Chui Cheng Chen Jen wen yan 晏文珍
author	Jen wen yan 晏文珍
spellingShingle	Jen wen yan 晏文珍 A Study of Using Data Mining Techniques for Text Classification
author_sort	Jen wen yan
title	A Study of Using Data Mining Techniques for Text Classification
title_short	A Study of Using Data Mining Techniques for Text Classification
title_full	A Study of Using Data Mining Techniques for Text Classification
title_fullStr	A Study of Using Data Mining Techniques for Text Classification
title_full_unstemmed	A Study of Using Data Mining Techniques for Text Classification
title_sort	study of using data mining techniques for text classification
publishDate	2005
url	http://ndltd.ncl.edu.tw/handle/48144129555740914501
work_keys_str_mv	AT jenwenyan astudyofusingdataminingtechniquesfortextclassification AT yànwénzhēn astudyofusingdataminingtechniquesfortextclassification AT jenwenyan lìyòngzīliàotànkānjìshùyúwénjiànfēnlèizhīyánjiū AT yànwénzhēn lìyòngzīliàotànkānjìshùyúwénjiànfēnlèizhīyánjiū AT jenwenyan studyofusingdataminingtechniquesfortextclassification AT yànwénzhēn studyofusingdataminingtechniquesfortextclassification
_version_	1718396244586397696
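
The abstract in this record describes building an ID3 tree over keyword features and checking the result with precision and recall. The record does not give the thesis's actual feature construction (keyword frequency and weighted location), so the following is only a minimal sketch under stated assumptions: keywords are treated as present or absent, the documents and labels are invented toy data rather than Reuters-21578, and the function names are hypothetical. It shows the information-gain criterion that ID3 uses to choose a split, and how precision and recall are computed for one category.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, keyword):
    """Entropy reduction from splitting documents on presence of `keyword`.

    ID3 picks, at each node, the attribute with the highest information gain.
    """
    with_kw = [y for d, y in zip(docs, labels) if keyword in d]
    without_kw = [y for d, y in zip(docs, labels) if keyword not in d]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (with_kw, without_kw) if part)
    return entropy(labels) - remainder


def precision_recall(predicted, actual, category):
    """Precision and recall of the predictions for a single category."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == category and a == category for p, a in pairs)
    fp = sum(p == category and a != category for p, a in pairs)
    fn = sum(p != category and a == category for p, a in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (assumption): each document is a set of keywords.
docs = [{"wheat", "export"}, {"wheat", "crop"},
        {"oil", "barrel"}, {"oil", "export"}]
labels = ["grain", "grain", "crude", "crude"]

# "wheat" separates the two categories perfectly; "export" does not help.
for kw in ("wheat", "export"):
    print(kw, information_gain(docs, labels, kw))

# Hypothetical classifier output, with one "crude" article mislabeled.
predicted = ["grain", "grain", "grain", "crude"]
print(precision_recall(predicted, labels, "grain"))
```

In this toy set, a tree would split on "wheat" first because it yields the larger gain. A frequency-based or location-weighted variant would replace the presence test with a thresholded score, which this sketch does not attempt.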
