Application of Decision Tree Methods on Spam Filtering

Master's === Tamkang University === Master's Program, Department of Statistics === 93 === With advances in computer science and the growth of the Internet, email has become an important communication medium in daily life. Email advertising has become the most efficient marketing technique, which has given rise to the problem of spam. The a...

Bibliographic Details
Main Authors:	Meng-Chuan, Tsai, 蔡孟娟
Other Authors:	Ching-Hsiang Chen
Format:	Others
Language:	zh-TW
Published:	2005
Online Access:	http://ndltd.ncl.edu.tw/handle/95491262592610440256

id	ndltd-TW-093TKU05337015
record_format	oai_dc
spelling	ndltd-TW-093TKU05337015 2015-10-13T11:57:26Z http://ndltd.ncl.edu.tw/handle/95491262592610440256 Application of Decision Tree Methods on Spam Filtering 決策樹法在垃圾郵件過濾之應用 Meng-Chuan, Tsai 蔡孟娟 碩士 淡江大學 統計學系碩士班 93 Ching-Hsiang Chen 陳景祥 2005 學位論文 ; thesis 63 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === Tamkang University === Master's Program, Department of Statistics === 93 === With advances in computer science and the growth of the Internet, email has become an important communication medium in daily life. Email advertising has become the most efficient marketing technique, which has given rise to the problem of spam. The volume of spam is increasing quickly; it not only consumes network resources and burdens systems but also wastes recipients' time. Spam filtering has become a popular research topic in recent years. In this study, we use three decision tree methods from data mining to classify emails as “spam” or “legitimate” based on fourteen characteristics of each email. The three decision tree methods are compared with the Bayes classifier, which is currently the method most often used in spam filtering. When classification efficiency and misclassification costs are considered, the C4.5 method gives the best results in our case study of spam mail, and it has the shortest test time of the three decision tree methods. Our study also shows that misclassifying legitimate email can be avoided by applying a whitelist before classification.
author2	Ching-Hsiang Chen
author_facet	Ching-Hsiang Chen Meng-Chuan, Tsai 蔡孟娟
author	Meng-Chuan, Tsai 蔡孟娟
spellingShingle	Meng-Chuan, Tsai 蔡孟娟 Application of Decision Tree Methods on Spam Filtering
author_sort	Meng-Chuan, Tsai
title	Application of Decision Tree Methods on Spam Filtering
title_short	Application of Decision Tree Methods on Spam Filtering
title_full	Application of Decision Tree Methods on Spam Filtering
title_fullStr	Application of Decision Tree Methods on Spam Filtering
title_full_unstemmed	Application of Decision Tree Methods on Spam Filtering
title_sort	application of decision tree methods on spam filtering
publishDate	2005
url	http://ndltd.ncl.edu.tw/handle/95491262592610440256
work_keys_str_mv	AT mengchuantsai applicationofdecisiontreemethodsonspamfiltering AT càimèngjuān applicationofdecisiontreemethodsonspamfiltering AT mengchuantsai juécèshùfǎzàilājīyóujiànguòlǜzhīyīngyòng AT càimèngjuān juécèshùfǎzàilājīyóujiànguòlǜzhīyīngyòng
_version_	1716851818581983232
