URL Crawling & classification system

Today, malware is often found on legitimate web sites that have been hacked. The aim of this thesis was to create a system that crawls potentially malicious web sites and rates them as malicious or not. Through research into current malware trends and mechanisms for detecting malware on the web, we analyzed a...


Bibliographic Details
Main Author: Vaagland, Emil Lindgjerdet
Format: Others
Language: English
Published: Norges teknisk-naturvitenskapelige universitet, Institutt for telematikk 2012
Subjects:
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-18764
id ndltd-UPSALLA1-oai-DiVA.org-ntnu-18764
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-ntnu-18764 | 2013-01-08T13:45:09Z | URL Crawling & classification system | eng | Vaagland, Emil Lindgjerdet | Norges teknisk-naturvitenskapelige universitet, Institutt for telematikk | Institutt for telematikk | 2012 | ntnudaim:7708 | MTKOM kommunikasjonsteknologi | Informasjonssikkerhet | Today, malware is often found on legitimate web sites that have been hacked. The aim of this thesis was to create a system that crawls potentially malicious web sites and rates them as malicious or not. Through research into current malware trends and mechanisms for detecting malware on the web, we analyzed and discussed the problem space before we began designing the system architecture. After implementing our suggested architecture, we ran the system through tests. These tests shed some light on the challenges we had discussed. We found that our hybrid honey-client approach helped detect malicious sites, as some malicious sites were only found when both honey-clients cooperated. In addition, we gained insight into how a low-interaction honey-client (LIHC) can be useful as a queue pre-processor for a high-interaction honey-client (HIHC). On top of that, we learned the consequence of operating a system like this without a well-built proxy server network: false negatives. | Student thesis | info:eu-repo/semantics/bachelorThesis | text | http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-18764 | Local ntnudaim:7708 | application/pdf | info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic ntnudaim:7708
MTKOM kommunikasjonsteknologi
Informasjonssikkerhet
spellingShingle ntnudaim:7708
MTKOM kommunikasjonsteknologi
Informasjonssikkerhet
Vaagland, Emil Lindgjerdet
URL Crawling & classification system
description Today, malware is often found on legitimate web sites that have been hacked. The aim of this thesis was to create a system that crawls potentially malicious web sites and rates them as malicious or not. Through research into current malware trends and mechanisms for detecting malware on the web, we analyzed and discussed the problem space before we began designing the system architecture. After implementing our suggested architecture, we ran the system through tests. These tests shed some light on the challenges we had discussed. We found that our hybrid honey-client approach helped detect malicious sites, as some malicious sites were only found when both honey-clients cooperated. In addition, we gained insight into how a low-interaction honey-client (LIHC) can be useful as a queue pre-processor for a high-interaction honey-client (HIHC). On top of that, we learned the consequence of operating a system like this without a well-built proxy server network: false negatives.
author Vaagland, Emil Lindgjerdet
author_facet Vaagland, Emil Lindgjerdet
author_sort Vaagland, Emil Lindgjerdet
title URL Crawling & classification system
title_short URL Crawling & classification system
title_full URL Crawling & classification system
title_fullStr URL Crawling & classification system
title_full_unstemmed URL Crawling & classification system
title_sort url crawling & classification system
publisher Norges teknisk-naturvitenskapelige universitet, Institutt for telematikk
publishDate 2012
url http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-18764
work_keys_str_mv AT vaaglandemillindgjerdet urlcrawlingclassificationsystem
_version_ 1716527945629040640