High-Speed Searching Target Data Traces Based on Statistical Sampling for Digital Forensics

As storage manufacturing technology advances, data storage capacity has been increasing exponentially. This growth has made forensic examination time-consuming and difficult. If the file system of a storage device remains intact, an examiner can find files that would be important eviden...

Bibliographic Details
Main Authors: Doowon Jeong, Sangjin Lee
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects: Forensics; computer crime; security; data acquisition
Online Access: https://ieeexplore.ieee.org/document/8917643/
id doaj-5c57b642ffe8422fb71950c1d4f55dcc
record_format Article
spelling doaj-5c57b642ffe8422fb71950c1d4f55dcc; 2021-03-30T00:49:42Z; eng; IEEE; IEEE Access; 2169-3536; 2019-01-01; vol. 7, pp. 172264-172276; 10.1109/ACCESS.2019.2956681; 8917643
High-Speed Searching Target Data Traces Based on Statistical Sampling for Digital Forensics
Doowon Jeong (https://orcid.org/0000-0001-7593-9416), Digital Forensics Research Center, Korea University, Seoul, South Korea
Sangjin Lee (https://orcid.org/0000-0002-6809-5179), Digital Forensics Research Center, Korea University, Seoul, South Korea
https://ieeexplore.ieee.org/document/8917643/
Forensics; computer crime; security; data acquisition
collection DOAJ
language English
format Article
sources DOAJ
author Doowon Jeong
Sangjin Lee
title High-Speed Searching Target Data Traces Based on Statistical Sampling for Digital Forensics
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2019-01-01
description As storage manufacturing technology advances, data storage capacity has been increasing exponentially. This growth has made forensic examination time-consuming and difficult. If the file system of a storage device remains intact, an examiner can find files that may be important evidence by analyzing the hierarchy, names, time information, etc. of files and folders. However, as anti-forensic techniques such as metadata destruction and disk formatting have become widely known, searching for data through the file system has become less practical. Moreover, significant evidence may be stored in unallocated areas, so investigating the entire storage area remains important. The best-known methods for detecting the existence of evidence are hash comparison and random sampling. Hash comparison, which calculates a hash for every sector and compares them, can detect all fragments of the evidence, but it requires an enormous amount of time and computing resources. Random sampling takes much less time because it examines only a portion of the storage, but it carries the risk of false negatives, which is critical for forensic examiners. In this paper, we combine the merits of both methods to eliminate false negatives while greatly reducing processing time. We use 16-byte values in a sector, instead of a traditional hash, to filter out unmatched sectors. The values are selected statistically, based on the frequency of occurrence at each offset. The effectiveness of our methodology is evaluated through several experiments.
topic Forensics
computer crime
security
data acquisition
url https://ieeexplore.ieee.org/document/8917643/
_version_ 1724187804840230912
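
The filtering idea in the description can be illustrated with a rough Python sketch. It is not the authors' implementation: the 512-byte sector size, the SHA-256 confirmation step, the function names, and in particular the offset-selection heuristic (prefer offsets where the most common byte value is least dominant in a sample) are all assumptions made for illustration, since the record only states that the 16 byte values are chosen statistically by frequency of occurrence per offset.

```python
import hashlib
from collections import Counter

SECTOR_SIZE = 512  # assumed sector size
KEY_LEN = 16       # number of byte positions compared per sector


def choose_offsets(sample_sectors, key_len=KEY_LEN):
    """Pick key_len offsets where the most frequent byte value is least
    dominant across the sample (a hypothetical selection heuristic)."""
    dominance = []
    for off in range(SECTOR_SIZE):
        counts = Counter(sector[off] for sector in sample_sectors)
        dominance.append((counts.most_common(1)[0][1], off))
    dominance.sort()
    return sorted(off for _, off in dominance[:key_len])


def key_of(sector, offsets):
    """Extract the 16-byte filter key from one sector."""
    return bytes(sector[o] for o in offsets)


def build_index(target_sectors, offsets):
    """Map each target sector's key to the full hashes sharing that key."""
    index = {}
    for sector in target_sectors:
        digest = hashlib.sha256(sector).digest()
        index.setdefault(key_of(sector, offsets), set()).add(digest)
    return index


def scan(image, index, offsets):
    """Return indices of sectors in image that equal a target sector.
    The cheap key lookup rejects most sectors; only sectors whose key
    matches are hashed to rule out false positives."""
    hits = []
    for pos in range(0, len(image) - SECTOR_SIZE + 1, SECTOR_SIZE):
        sector = image[pos:pos + SECTOR_SIZE]
        hashes = index.get(key_of(sector, offsets))
        if hashes is not None and hashlib.sha256(sector).digest() in hashes:
            hits.append(pos // SECTOR_SIZE)
    return hits
```

In this sketch every sector is still read, so no target sector can be missed: a sector identical to a target sector necessarily has the same bytes at the chosen offsets. The saving comes from hashing only the sectors that pass the 16-byte comparison.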