An Efficient Big Data Anonymization Algorithm Based on Chaos and Perturbation Techniques
The topic of big data has attracted increasing interest in recent years. The emergence of big data creates new difficulties for the protection models used for data privacy, which is necessary for sharing and processing data. Protecting individuals’ sensitive information while maint...
Main Authors: | Can Eyupoglu, Muhammed Ali Aydin, Abdul Halim Zaim, Ahmet Sertbas |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2018-05-01 |
Series: | Entropy |
Subjects: | big data, chaos, data anonymization, data perturbation, privacy preserving |
Online Access: | http://www.mdpi.com/1099-4300/20/5/373 |
id |
doaj-7d8f3dfa8cf04a7cb737a370757a2611 |
---|---|
record_format |
Article |
spelling |
doaj-7d8f3dfa8cf04a7cb737a370757a2611; 2020-11-24T21:03:00Z; eng; MDPI AG; Entropy; 1099-4300; 2018-05-01; vol. 20, no. 5, art. 373; 10.3390/e20050373; e20050373; An Efficient Big Data Anonymization Algorithm Based on Chaos and Perturbation Techniques; Can Eyupoglu (0), Muhammed Ali Aydin (1), Abdul Halim Zaim (2), Ahmet Sertbas (3); (0) Department of Computer Engineering, Istanbul Commerce University, Istanbul 34840, Turkey; (1) Department of Computer Engineering, Istanbul University, Istanbul 34320, Turkey; (2) Department of Computer Engineering, Istanbul Commerce University, Istanbul 34840, Turkey; (3) Department of Computer Engineering, Istanbul University, Istanbul 34320, Turkey; The topic of big data has attracted increasing interest in recent years. The emergence of big data creates new difficulties for the protection models used for data privacy, which is necessary for sharing and processing data. Protecting individuals’ sensitive information while maintaining the usability of the published data set is the most important challenge in privacy preservation. In this regard, data anonymization methods are used to protect data against identity disclosure and linking attacks. In this study, a novel data anonymization algorithm based on chaos and perturbation is proposed for preserving privacy and utility in big data. The performance of the proposed algorithm is evaluated in terms of Kullback–Leibler divergence, probabilistic anonymity, classification accuracy, F-measure and execution time. The experimental results show that the proposed algorithm is efficient and performs better in terms of Kullback–Leibler divergence, classification accuracy and F-measure than most existing algorithms evaluated on the same data set. Because it applies chaos to perturb the data, the algorithm is promising for use in privacy-preserving data mining and data publishing.; http://www.mdpi.com/1099-4300/20/5/373; big data; chaos; data anonymization; data perturbation; privacy preserving |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Can Eyupoglu Muhammed Ali Aydin Abdul Halim Zaim Ahmet Sertbas |
title |
An Efficient Big Data Anonymization Algorithm Based on Chaos and Perturbation Techniques |
title_sort |
efficient big data anonymization algorithm based on chaos and perturbation techniques |
publisher |
MDPI AG |
series |
Entropy |
issn |
1099-4300 |
publishDate |
2018-05-01 |
description |
The topic of big data has attracted increasing interest in recent years. The emergence of big data creates new difficulties for the protection models used for data privacy, which is necessary for sharing and processing data. Protecting individuals’ sensitive information while maintaining the usability of the published data set is the most important challenge in privacy preservation. In this regard, data anonymization methods are used to protect data against identity disclosure and linking attacks. In this study, a novel data anonymization algorithm based on chaos and perturbation is proposed for preserving privacy and utility in big data. The performance of the proposed algorithm is evaluated in terms of Kullback–Leibler divergence, probabilistic anonymity, classification accuracy, F-measure and execution time. The experimental results show that the proposed algorithm is efficient and performs better in terms of Kullback–Leibler divergence, classification accuracy and F-measure than most existing algorithms evaluated on the same data set. Because it applies chaos to perturb the data, the algorithm is promising for use in privacy-preserving data mining and data publishing. |
topic |
big data chaos data anonymization data perturbation privacy preserving |
url |
http://www.mdpi.com/1099-4300/20/5/373 |