A Deep Learning-Based Innovative Technique for Phishing Detection in Modern Security with Uniform Resource Locators

Bibliographic Details
Main Authors: Aldakheel, E.A (Author), Almarshad, F.A (Author), Alzahrani, A.I.A (Author), Gashgari, G.A (Author), Zakariah, M. (Author)
Format: Article
Language: English
Published: MDPI 2023
Online Access: View Fulltext in Publisher
View in Scopus
LEADER 04028nam a2200541Ia 4500
001 10.3390-s23094403
008 230529s2023 CNT 000 0 und d
020 |a 14248220 (ISSN) 
245 1 0 |a A Deep Learning-Based Innovative Technique for Phishing Detection in Modern Security with Uniform Resource Locators 
260 0 |b MDPI  |c 2023 
856 |z View Fulltext in Publisher  |u https://doi.org/10.3390/s23094403 
856 |z View in Scopus  |u https://www.scopus.com/inward/record.uri?eid=2-s2.0-85159346775&doi=10.3390%2fs23094403&partnerID=40&md5=3f04b3c85c2d34cd8261d65adeb7a0da 
520 3 |a Organizations and individuals worldwide are becoming increasingly vulnerable to cyberattacks as phishing attacks and the number of phishing websites continue to grow. As a result, improved cyber defense requires more effective phishing detection (PD). In this paper, we introduce a novel method for detecting phishing sites with high accuracy. Our approach uses a Convolutional Neural Network (CNN)-based model that distinguishes legitimate websites from phishing websites. We evaluate the performance of our model on the PhishTank dataset, which is widely used for detecting phishing websites based solely on Uniform Resource Locator (URL) features. Our approach contributes to the field of phishing detection by achieving high accuracy and outperforming previous state-of-the-art models. Experimental results show that the proposed method performs well in terms of accuracy and false-positive rate. We created a real data set by crawling 10,000 phishing URLs from PhishTank and 10,000 legitimate websites, and then ran experiments on it using standard evaluation metrics. This approach is founded on integrated and deep learning (DL). The CNN-based model can distinguish phishing websites from legitimate websites with a high degree of accuracy. When binary-categorical loss and the Adam optimizer are used, the accuracies of the k-nearest neighbors (KNN), Natural Language Processing (NLP), Recurrent Neural Network (RNN), and Random Forest (RF) models are 87%, 97.98%, 97.4%, and 94.26%, respectively, in contrast to previous publications. Our model outperformed previous works due to several factors, including the use of more layers and larger training sizes, and the extraction of additional features from the PhishTank dataset.
Specifically, our proposed model comprises seven layers, starting with the input layer and progressing to the seventh, which incorporates a layer with pooling, convolutional, linear 1 and 2, and linear six layers as the output layers. These design choices contribute to the model's accuracy of 98.77%. © 2023 by the authors.
650 0 4 |a Computer crime 
650 0 4 |a Convolution 
650 0 4 |a convolutional neural network 
650 0 4 |a Convolutional neural network 
650 0 4 |a Convolutional neural networks 
650 0 4 |a Data set 
650 0 4 |a deep learning 
650 0 4 |a Deep learning 
650 0 4 |a Deep neural networks 
650 0 4 |a Detection system 
650 0 4 |a Learning algorithms 
650 0 4 |a Learning systems 
650 0 4 |a Long short-term memory 
650 0 4 |a machine-learning 
650 0 4 |a Machine-learning 
650 0 4 |a Multilayer neural networks 
650 0 4 |a Natural language processing systems 
650 0 4 |a Nearest neighbor search 
650 0 4 |a Network security 
650 0 4 |a phishing detection system 
650 0 4 |a Phishing detection system 
650 0 4 |a Phishing detections 
650 0 4 |a Phishing websites 
650 0 4 |a Phishtank data set 
650 0 4 |a PhishTank data set 
650 0 4 |a Reinforcement 
650 0 4 |a Uniform resource locator analyse 
650 0 4 |a URL analysis 
650 0 4 |a Websites 
700 1 0 |a Aldakheel, E.A.  |e author 
700 1 0 |a Almarshad, F.A.  |e author 
700 1 0 |a Alzahrani, A.I.A.  |e author 
700 1 0 |a Gashgari, G.A.  |e author 
700 1 0 |a Zakariah, M.  |e author 
773 |t Sensors
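
The abstract in this record describes classifying URLs from URL features alone and reporting accuracy and false-positive rate. As a rough illustration only, the sketch below shows one common way to turn a URL into a fixed-length integer sequence that a CNN could take as input, and how those two metrics can be computed from predictions. The alphabet, the sequence length of 64, and the function names `encode_url` and `accuracy_and_fpr` are assumptions made for this example; the record does not say how the authors encoded URLs or built their model.

```python
from typing import List, Sequence, Tuple

# Hypothetical character set and sequence length (not taken from the paper).
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=%"
# Id 0 is reserved for both padding and characters outside the alphabet.
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
MAX_LEN = 64


def encode_url(url: str, max_len: int = MAX_LEN) -> List[int]:
    """Map a URL to a fixed-length list of integer character ids."""
    # Lowercase, truncate to max_len, then look up each character.
    ids = [CHAR_TO_ID.get(ch, 0) for ch in url.lower()[:max_len]]
    # Right-pad with zeros so every URL has the same length.
    return ids + [0] * (max_len - len(ids))


def accuracy_and_fpr(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (accuracy, false-positive rate); 1 = phishing, 0 = legitimate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    accuracy = (tp + tn) / len(pairs)
    # FPR is the share of legitimate sites wrongly flagged as phishing.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr
```

With a balanced data set such as the 10,000 phishing and 10,000 legitimate URLs mentioned in the abstract, accuracy and false-positive rate together give a fuller picture than accuracy alone, since the false-positive rate isolates how often legitimate sites are misclassified.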