An In-Depth Benchmarking and Evaluation of Phishing Detection Research for Security Needs

We perform an in-depth, systematic benchmarking study and evaluation of phishing features on diverse and extensive datasets. We propose a new taxonomy of features based on the interpretation and purpose of each feature. Next, we propose a benchmarking framework called 'PhishBench,' which enable...


Bibliographic Details
Main Authors: Ayman El Aassal, Shahryar Baki, Avisha Das, Rakesh M. Verma
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Feature engineering; feature taxonomy; framework; phishing email; phishing URL; phishing website
Online Access: https://ieeexplore.ieee.org/document/8970564/
id doaj-0486f29d81da4ad7a0ea2e91fbcedaa7
record_format Article
spelling doaj-0486f29d81da4ad7a0ea2e91fbcedaa7
2021-03-30T01:15:27Z
eng
IEEE
IEEE Access
2169-3536
2020-01-01
8
22170
22192
10.1109/ACCESS.2020.2969780
8970564
An In-Depth Benchmarking and Evaluation of Phishing Detection Research for Security Needs
Ayman El Aassal 0
Shahryar Baki 1
https://orcid.org/0000-0002-9814-9270
Avisha Das 2
Rakesh M. Verma 3
Department of Computer Science, University of Houston, Houston, TX, USA
Department of Computer Science, University of Houston, Houston, TX, USA
Department of Computer Science, University of Houston, Houston, TX, USA
Department of Computer Science, University of Houston, Houston, TX, USA
We perform an in-depth, systematic benchmarking study and evaluation of phishing features on diverse and extensive datasets. We propose a new taxonomy of features based on the interpretation and purpose of each feature. Next, we propose a benchmarking framework called 'PhishBench,' which enables us to evaluate and compare the existing features for phishing detection systematically and thoroughly under identical experimental conditions, i.e., unified system specification, datasets, classifiers, and evaluation metrics. PhishBench is a first in the field of benchmarking phishing related research and incorporates thorough and systematic evaluation and feature comparison. We use PhishBench to test methods published in the phishing literature on new and diverse datasets to check their robustness and scalability. We study how dataset characteristics, e.g., varying legitimate to phishing ratios and increasing the size of imbalanced datasets, affect classification performance. Our results show that the imbalanced nature of phishing attacks affects the detection systems' performance and researchers should take this into account when proposing a new method. We also found that retraining alone is not enough to defeat new attacks. New features and techniques are required to stop attackers from fooling detection systems.
https://ieeexplore.ieee.org/document/8970564/
Feature engineering
feature taxonomy
framework
phishing email
phishing URL
phishing website
collection DOAJ
language English
format Article
sources DOAJ
author Ayman El Aassal
Shahryar Baki
Avisha Das
Rakesh M. Verma
spellingShingle Ayman El Aassal
Shahryar Baki
Avisha Das
Rakesh M. Verma
An In-Depth Benchmarking and Evaluation of Phishing Detection Research for Security Needs
IEEE Access
Feature engineering
feature taxonomy
framework
phishing email
phishing URL
phishing website
author_facet Ayman El Aassal
Shahryar Baki
Avisha Das
Rakesh M. Verma
author_sort Ayman El Aassal
title An In-Depth Benchmarking and Evaluation of Phishing Detection Research for Security Needs
title_short An In-Depth Benchmarking and Evaluation of Phishing Detection Research for Security Needs
title_full An In-Depth Benchmarking and Evaluation of Phishing Detection Research for Security Needs
title_fullStr An In-Depth Benchmarking and Evaluation of Phishing Detection Research for Security Needs
title_full_unstemmed An In-Depth Benchmarking and Evaluation of Phishing Detection Research for Security Needs
title_sort in-depth benchmarking and evaluation of phishing detection research for security needs
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description We perform an in-depth, systematic benchmarking study and evaluation of phishing features on diverse and extensive datasets. We propose a new taxonomy of features based on the interpretation and purpose of each feature. Next, we propose a benchmarking framework called 'PhishBench,' which enables us to evaluate and compare the existing features for phishing detection systematically and thoroughly under identical experimental conditions, i.e., unified system specification, datasets, classifiers, and evaluation metrics. PhishBench is a first in the field of benchmarking phishing related research and incorporates thorough and systematic evaluation and feature comparison. We use PhishBench to test methods published in the phishing literature on new and diverse datasets to check their robustness and scalability. We study how dataset characteristics, e.g., varying legitimate to phishing ratios and increasing the size of imbalanced datasets, affect classification performance. Our results show that the imbalanced nature of phishing attacks affects the detection systems' performance and researchers should take this into account when proposing a new method. We also found that retraining alone is not enough to defeat new attacks. New features and techniques are required to stop attackers from fooling detection systems.
topic Feature engineering
feature taxonomy
framework
phishing email
phishing URL
phishing website
url https://ieeexplore.ieee.org/document/8970564/
work_keys_str_mv AT aymanelaassal anindepthbenchmarkingandevaluationofphishingdetectionresearchforsecurityneeds
AT shahryarbaki anindepthbenchmarkingandevaluationofphishingdetectionresearchforsecurityneeds
AT avishadas anindepthbenchmarkingandevaluationofphishingdetectionresearchforsecurityneeds
AT rakeshmverma anindepthbenchmarkingandevaluationofphishingdetectionresearchforsecurityneeds
AT aymanelaassal indepthbenchmarkingandevaluationofphishingdetectionresearchforsecurityneeds
AT shahryarbaki indepthbenchmarkingandevaluationofphishingdetectionresearchforsecurityneeds
AT avishadas indepthbenchmarkingandevaluationofphishingdetectionresearchforsecurityneeds
AT rakeshmverma indepthbenchmarkingandevaluationofphishingdetectionresearchforsecurityneeds
_version_ 1724187322209009664