Using feature selection and classification approaches in credit score and spam filtering
Master's === National Chi Nan University === Department of Information Management === 95 === Data mining has become a popular technology in recent years. It has been applied to spam filtering, credit risk assessment, medicine, financial forecasting, and industry, among other areas, and it can forecast a variety of situations accurately and efficiently. Classification is a...
Main Authors: | Yen-Wei Su, 蘇彥暐 |
---|---|
Other Authors: | Ping-Feng Pai |
Format: | Others |
Language: | zh-TW |
Published: | 2007 |
Online Access: | http://ndltd.ncl.edu.tw/handle/27394857735362845888 |
id |
ndltd-TW-095NCNU0396014 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-095NCNU0396014 2015-10-13T16:41:22Z http://ndltd.ncl.edu.tw/handle/27394857735362845888 Using feature selection and classification approaches in credit score and spam filtering 結合特徵擷取與分類技術於信用評分與過濾垃圾郵件之應用 Yen-Wei Su 蘇彥暐 Master's National Chi Nan University Department of Information Management 95 Data mining has become a popular technology in recent years. It has been applied to spam filtering, credit risk assessment, medicine, financial forecasting, and industry, among other areas, and it can forecast a variety of situations accurately and efficiently. Classification is an important area of data mining. In this research, we applied classification tools to spam filtering and to credit card data, and combined them with feature selection to identify the important attributes in the original data and so improve accuracy. A credit score is an index for judging an applicant's risk of default; its purpose is to prevent losses to companies or banks. With a large customer database, however, noise (irrelevant attributes) can introduce errors. This study therefore proposes a hybrid method that uses factor analysis to filter out the noise before classification. The other main topic of this research is spam filtering. Because spam constantly troubles users, many filtering techniques have been used against it, such as back-propagation networks, decision trees, Bayesian filtering, and support vector machines. The filtering targets also differ: blacklists and whitelists, mail headers, or keywords in the mail content. We therefore combine feature selection with filtering methods to improve classification accuracy. This experiment targets the content of the mail under three conditions: (1) the keywords in the mail, (2) heuristic features, and (3) both. The research compares the three conditions across different filtering approaches to obtain the best classification result. The results of the two experiments indicate that combining feature selection with classification approaches can improve accuracy and reduce noise.
Ping-Feng Pai 白炳豐 2007 thesis 61 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Chi Nan University === Department of Information Management === 95 === Data mining has become a popular technology in recent years. It has been applied to spam filtering, credit risk assessment, medicine, financial forecasting, and industry, among other areas, and it can forecast a variety of situations accurately and efficiently. Classification is an important area of data mining. In this research, we applied classification tools to spam filtering and to credit card data, and combined them with feature selection to identify the important attributes in the original data and so improve accuracy. A credit score is an index for judging an applicant's risk of default; its purpose is to prevent losses to companies or banks. With a large customer database, however, noise (irrelevant attributes) can introduce errors. This study therefore proposes a hybrid method that uses factor analysis to filter out the noise before classification.
The other main topic of this research is spam filtering. Because spam constantly troubles users, many filtering techniques have been used against it, such as back-propagation networks, decision trees, Bayesian filtering, and support vector machines. The filtering targets also differ: blacklists and whitelists, mail headers, or keywords in the mail content. We therefore combine feature selection with filtering methods to improve classification accuracy. This experiment targets the content of the mail under three conditions: (1) the keywords in the mail, (2) heuristic features, and (3) both. The research compares the three conditions across different filtering approaches to obtain the best classification result. The results of the two experiments indicate that combining feature selection with classification approaches can improve accuracy and reduce noise.
|
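The three spam-filtering conditions named in the abstract (keywords, heuristic features, both) can be illustrated with a small sketch. This is not the thesis's implementation: the keyword list, the three heuristics, the toy messages, and the helper names (`keyword_features`, `heuristic_features`, `both_features`, `BernoulliNB`, `accuracy`) are all assumptions made for illustration, and a hand-written Bernoulli naive Bayes classifier stands in for the Bayesian filtering the abstract mentions.

```python
import math
import re

# Hypothetical keyword list; the thesis does not publish its keywords here.
KEYWORDS = ["free", "winner", "credit", "meeting", "report"]


def keyword_features(text):
    """Condition 1: binary presence of each keyword in the mail content."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {f"kw:{k}": k in words for k in KEYWORDS}


def heuristic_features(text):
    """Condition 2: simple content heuristics (assumed examples only)."""
    letters = [c for c in text if c.isalpha()]
    upper_ratio = sum(c.isupper() for c in letters) / max(len(letters), 1)
    return {
        "h:many_exclamations": text.count("!") >= 2,
        "h:mostly_uppercase": upper_ratio > 0.5,
        "h:has_url": "http" in text.lower(),
    }


def both_features(text):
    """Condition 3: keyword and heuristic features together."""
    return {**keyword_features(text), **heuristic_features(text)}


class BernoulliNB:
    """Minimal Bernoulli naive Bayes over boolean features."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.names = sorted(rows[0])
        self.prior, self.cond = {}, {}
        for c in self.classes:
            members = [r for r, y in zip(rows, labels) if y == c]
            self.prior[c] = len(members) / len(rows)
            # Laplace smoothing keeps every probability strictly in (0, 1).
            self.cond[c] = {
                n: (sum(r[n] for r in members) + 1) / (len(members) + 2)
                for n in self.names
            }
        return self

    def predict(self, row):
        def log_posterior(c):
            score = math.log(self.prior[c])
            for n in self.names:
                p = self.cond[c][n]
                score += math.log(p if row[n] else 1 - p)
            return score

        return max(self.classes, key=log_posterior)


def accuracy(extract, train, test):
    """Train on one feature condition and return the test accuracy."""
    model = BernoulliNB().fit(
        [extract(t) for t, _ in train], [y for _, y in train]
    )
    return sum(model.predict(extract(t)) == y for t, y in test) / len(test)


# Toy messages invented for this sketch; not data from the thesis.
TRAIN = [
    ("FREE credit!!! You are a WINNER, visit http://example.com", "spam"),
    ("WINNER!! Claim your free prize now", "spam"),
    ("Meeting moved to 3pm, report attached", "ham"),
    ("Please review the quarterly report before the meeting", "ham"),
]
TEST = [
    ("Free offer!!! click http://example.com", "spam"),
    ("Agenda for tomorrow's meeting", "ham"),
]

for name, fn in [("keywords", keyword_features),
                 ("heuristics", heuristic_features),
                 ("both", both_features)]:
    print(name, accuracy(fn, TRAIN, TEST))
```

Each condition is just a different feature extractor passed to the same classifier, so swapping in another classifier (for example a decision tree or a support vector machine) would leave the comparison loop unchanged. Accuracies on a toy set this small say nothing about real mail.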
author2 |
Ping-Feng Pai |
author_facet |
Ping-Feng Pai Yen-Wei Su 蘇彥暐 |
author |
Yen-Wei Su 蘇彥暐 |
author_sort |
Yen-Wei Su |
title |
Using feature selection and classification approaches in credit score and spam filtering |
publishDate |
2007 |
url |
http://ndltd.ncl.edu.tw/handle/27394857735362845888 |