Suspicious URL Filter based on Logistic Regression with Multi-view Analysis

Master's thesis === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 100 === Current malicious-URL detection techniques based on URL analysis struggle to find malicious URLs that are disguised with obfuscation techniques (e.g., insertion of benign tokens). In this study, we propose a multi-view approach to reduce the...

Bibliographic Details
Main Authors:	Ke-wei Su, 蘇克維
Other Authors:	Han-ming Lee
Format:	Others
Language:	en_US
Published:	2012
Online Access:	http://ndltd.ncl.edu.tw/handle/3kf7h7

id	ndltd-TW-100NTUS5392041
record_format	oai_dc
spelling	ndltd-TW-100NTUS5392041 2019-05-15T20:43:22Z http://ndltd.ncl.edu.tw/handle/3kf7h7 Suspicious URL Filter based on Logistic Regression with Multi-view Analysis 可疑連結過濾器基於羅吉斯迴歸與多觀點分析 Ke-wei Su 蘇克維 Master's thesis, National Taiwan University of Science and Technology, Department of Computer Science and Information Engineering, 100 Han-ming Lee 李漢銘 2012 thesis 45 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's thesis === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 100 === Current malicious-URL detection techniques based on URL analysis struggle to find malicious URLs that are disguised with obfuscation techniques (e.g., insertion of benign tokens). In this study, we propose a multi-view approach to reduce the impact of such obfuscation. A URL is composed of several tokens, and each token carries a different meaning. Attackers apply different obfuscation techniques, combining tokens in different portions of the URL, and each technique exhibits its own behavior. The proposed mechanism learns these behaviors from the different portions of a URL (e.g., the authority portion) to estimate how suspicious each portion is. By comparing the per-portion suspicion levels across URLs, the system selects the most suspicious URLs. This thesis makes the following contributions: (1) it provides a multi-view mechanism that reduces the effect of obfuscation techniques, (2) it filters out suspicious URLs automatically, without additional configuration or modification, (3) it handles large-scale, imbalanced data effectively, and (4) it satisfies industry requirements. The system is evaluated on a real data set from T. Co. The requirements of T. Co. are: (1) the detection rate should be less than 25%, (2) the missing rate should be lower than 25%, and (3) one hour of data should be processed within one hour. The experimental results show that the approach is effective, finds more malicious URLs, and satisfies the requirements of both the practical environment and T. Co.
author2	Han-ming Lee
author_facet	Han-ming Lee Ke-wei Su 蘇克維
author	Ke-wei Su 蘇克維
title	Suspicious URL Filter based on Logistic Regression with Multi-view Analysis
publishDate	2012
url	http://ndltd.ncl.edu.tw/handle/3kf7h7
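
The abstract describes learning a separate suspicion level for each portion of a URL and then comparing those levels across URLs. The following is only a minimal sketch of that idea, not the thesis implementation. It assumes three views (authority, path, query; the abstract names only the authority portion), a bag-of-tokens logistic regression per view trained by plain stochastic gradient descent, and a "take the most suspicious view" rule for combining views. The training URLs are made-up toy data, and all function names are hypothetical.

```python
import math
import re
from urllib.parse import urlsplit

# Assumed views; the abstract only gives the authority portion as an example.
VIEWS = ("authority", "path", "query")

def split_views(url):
    """Split a URL into one lowercase token list per view."""
    parts = urlsplit(url)
    raw = {"authority": parts.netloc, "path": parts.path, "query": parts.query}
    return {v: [t for t in re.split(r"[^A-Za-z0-9]+", raw[v].lower()) if t]
            for v in VIEWS}

def score(weights, tokens):
    """Logistic-regression probability that a bag of tokens is malicious."""
    z = weights.get("__bias__", 0.0) + sum(weights.get(t, 0.0) for t in tokens)
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp from overflowing
    return 1.0 / (1.0 + math.exp(-z))

def train_view(samples, epochs=200, lr=0.5):
    """Fit one view's model; samples are (tokens, label), label 1 = malicious."""
    weights = {}
    for _ in range(epochs):
        for tokens, label in samples:
            err = score(weights, tokens) - label  # log-loss gradient w.r.t. z
            weights["__bias__"] = weights.get("__bias__", 0.0) - lr * err
            for t in tokens:
                weights[t] = weights.get(t, 0.0) - lr * err
    return weights

def train(urls, labels):
    """Train one independent logistic regression per view."""
    split = [split_views(u) for u in urls]
    return {v: train_view([(s[v], y) for s, y in zip(split, labels)])
            for v in VIEWS}

def suspicion(models, url):
    """Return (overall, per_view); overall is the most suspicious view's score.
    Taking the maximum is an assumption made for this sketch."""
    views = split_views(url)
    per_view = {v: score(models[v], views[v]) for v in VIEWS}
    return max(per_view.values()), per_view

# Toy, made-up training data (1 = malicious, 0 = benign).
TRAIN_URLS = [
    "http://example.com/news/index.html?id=7",
    "http://example.org/shop/cart?item=3",
    "http://evil-login.test/update/account?session=1",
    "http://secure.evil-login.test/verify/password?token=9",
]
TRAIN_LABELS = [0, 0, 1, 1]

if __name__ == "__main__":
    models = train(TRAIN_URLS, TRAIN_LABELS)
    # A malicious authority hidden behind a benign-looking path and query.
    overall, per_view = suspicion(models, "http://evil-login.test/news/index.html?id=7")
    print(overall, per_view)
```

Scoring each portion separately means that benign tokens inserted into the path or query cannot lower the score of a suspicious authority, which is the effect the abstract attributes to multi-view analysis. A single model over the whole URL would let those benign tokens offset the malicious ones.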

Similar Items