SeFACED: Semantic-Based Forensic Analysis and Classification of E-Mail Data Using Deep Learning

Artificial Intelligence (AI), in combination with the Internet of Things (IoT), called (AIoT), an emerging trend in industrial applications, is capable of intelligent decision-making with self-driven analytics. With its extensive usage in diverse scenarios, IoT devices generate bulk data contrived b...

Bibliographic Details
Main Authors:	Maryam Hina, Mohsin Ali, Abdul Rehman Javed, Fahad Ghabban, Liaqat Ali Khan, Zunera Jalil
Format:	Article
Language:	English
Published:	IEEE 2021-01-01
Series:	IEEE Access
Subjects:	Artificial intelligence; cybercrimes; multiclass e-mail classification; deep learning; cybersecurity
Online Access:	https://ieeexplore.ieee.org/document/9477611/

id	doaj-293eef46caa348ffad6c0a2f231e4898
record_format	Article
spelling	doaj-293eef46caa348ffad6c0a2f231e4898 | 2021-07-15T23:00:18Z | eng | IEEE | IEEE Access | 2169-3536 | 2021-01-01 | 9 | 98398 | 98411 | 10.1109/ACCESS.2021.3095730 | 9477611 | SeFACED: Semantic-Based Forensic Analysis and Classification of E-Mail Data Using Deep Learning | Maryam Hina (Department of Computer Science, Air University, Islamabad, Pakistan) | Mohsin Ali (Department of Computer Science, Air University, Islamabad, Pakistan) | Abdul Rehman Javed, https://orcid.org/0000-0002-0570-1813 (Department of Cyber Security, Air University, Islamabad, Pakistan) | Fahad Ghabban (Information System Department, College of Computer Science and Engineering, Taibah University, Medina, Saudi Arabia) | Liaqat Ali Khan (Department of Cyber Security, Air University, Islamabad, Pakistan) | Zunera Jalil, https://orcid.org/0000-0003-2531-2564 (Department of Cyber Security, Air University, Islamabad, Pakistan)
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Maryam Hina, Mohsin Ali, Abdul Rehman Javed, Fahad Ghabban, Liaqat Ali Khan, Zunera Jalil
author_sort	Maryam Hina
title	SeFACED: Semantic-Based Forensic Analysis and Classification of E-Mail Data Using Deep Learning
publisher	IEEE
series	IEEE Access
issn	2169-3536
publishDate	2021-01-01
description	Artificial Intelligence (AI) combined with the Internet of Things (IoT), known as AIoT, is an emerging trend in industrial applications that is capable of intelligent decision-making with self-driven analytics. Given their extensive usage in diverse scenarios, IoT devices generate bulk data that attackers contrive to disrupt normal operations and services. Hence, there is a need for proactive data analysis to prevent cyber-attacks and crimes. To investigate crimes involving Electronic Mail (e-mail), analysis of both the header and the email body is required, since the semantics of communication help to identify the source of potential evidence. With the continued growth of data shared via emails, investigators now face the daunting challenge of extracting the required semantic information from bulks of emails, which delays the investigation process. This gives criminals an edge in erasing the footprints of their malicious acts. The existing keyword-based search techniques and filtration often return extraneous, short-sequence emails and skip meaningful information. To overcome this limitation, we propose a novel, efficient approach named SeFACED that uses a Long Short-Term Memory (LSTM) based Gated Recurrent Neural Network (GRU) for multiclass email classification. SeFACED works not only on short sequences but also on long dependencies of 1000+ characters. SeFACED focuses on tuning the LSTM-based GRU parameters to attain the best performance and is assessed by comparison with traditional machine learning models, deep learning models, and state-of-the-art studies on the subject. Experimental results on self-extended benchmark datasets show that SeFACED outperforms existing methods while keeping the classification process robust and reliable.
topic	Artificial intelligence; cybercrimes; multiclass e-mail classification; deep learning; cybersecurity
url	https://ieeexplore.ieee.org/document/9477611/
_version_	1721297940204486656

Similar Items