A study of EXT4 file system with data deduplication support on NVM

Bibliographic Details
Main Authors: Hsiang-Yuan Chuang, 莊翔淵
Other Authors: Hsin-Wen Wei
Format: Others
Language: zh-TW
Published: 2018
Online Access: http://ndltd.ncl.edu.tw/handle/xztyuq
Description
Summary: Master's thesis === Tamkang University === Master's Program, Department of Electrical Engineering === 106 === As technology continues to develop, many new techniques and products have emerged. Most of them generate huge amounts of data and require high-speed reads and writes. Storage demand has increased accordingly, and Non-Volatile Memory (NVM), which balances read/write speed and data capacity, has become an important storage medium. However, because NVM has a relatively small capacity compared with traditional storage such as disks, it is important to reduce the space that data occupies. Techniques for saving storage space fall into two main categories: data compression and data deduplication. Data deduplication removes redundant copies of the same data, leaving only one copy. This reduces the space the data needs and can improve write speed. There are two main deduplication techniques: file-level deduplication and block-level deduplication. File-level deduplication treats a whole file as the unit of deduplication, whereas block-level deduplication splits a file into data blocks and treats each block as the unit. Block-level deduplication can greatly reduce storage space compared with file-level deduplication. Therefore, in this thesis, we extend the EXT4 file system with data deduplication functionality. To make the file system perform better on NVM and save more space, we changed parts of the file system structure and used the extent structure of EXT4 to track every data block, both to search for identical blocks and to deduplicate them. The proposed file system, called DeEXT, enables EXT4 to support block-level deduplication efficiently while writing data to NVM storage.
As the simulation and analysis results in this thesis show, the DeEXT4 file system effectively reduces the amount of duplicate data written to storage, and it reduces a larger amount of metadata when the file duplication rate is higher.
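
The difference between file-level and block-level deduplication described in the summary can be illustrated with a small in-memory sketch. This is not the thesis's implementation: the summary says the proposed file system tracks blocks through EXT4 extents, whereas the sketch below assumes a simple SHA-256 content index. The `BlockStore` class, its method names, and the 4 KiB block size are all hypothetical choices made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this sketch


class BlockStore:
    """Toy block-level deduplicating store (hypothetical, in memory only).

    Each file name is written once; overwrite and delete are not handled.
    """

    def __init__(self):
        self.blocks = {}    # digest -> block bytes, each unique block kept once
        self.refcount = {}  # digest -> number of references to that block
        self.files = {}     # file name -> ordered list of block digests

    def write(self, name, data):
        digests = []
        for off in range(0, len(data), BLOCK_SIZE):
            block = data[off:off + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                # First time this content is seen: store the block itself.
                self.blocks[digest] = block
            # Duplicate or not, the file gains one reference to the block.
            self.refcount[digest] = self.refcount.get(digest, 0) + 1
            digests.append(digest)
        self.files[name] = digests

    def read(self, name):
        return b"".join(self.blocks[d] for d in self.files[name])

    def stored_bytes(self):
        return sum(len(b) for b in self.blocks.values())


store = BlockStore()
shared = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
store.write("a.bin", shared + b"C" * BLOCK_SIZE)
store.write("b.bin", shared + b"D" * BLOCK_SIZE)

logical = 6 * BLOCK_SIZE
print(store.stored_bytes(), "of", logical, "logical bytes stored")
# prints: 16384 of 24576 logical bytes stored
```

The two files differ in their last block, so file-level deduplication would store both in full. Block-level deduplication stores the two shared blocks once, keeping four unique blocks instead of six.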