Privacy preserving data publishing based on sensitivity in context of Big Data using Hive

Bibliographic Details
Main Authors: P. Srinivasa Rao, S. Satyanarayana
Format: Article
Language: English
Published: SpringerOpen 2018-07-01
Series: Journal of Big Data
Online Access: http://link.springer.com/article/10.1186/s40537-018-0130-y
Description
Summary: Privacy-preserving data publication is a major concern today, because the volume of data published over the internet keeps growing. Data at this scale is termed Big Data. This project addresses privacy preservation in the context of big data using a data warehousing solution called Hive. We implemented nearest similarity based (NSB) clustering with bottom-up generalization to achieve (v,l)-anonymity, which addresses sensitivity vulnerabilities and ensures individual privacy. We also calculate sensitivity levels by a simple comparison method using index values, classifying the data into different levels of sensitivity. The experiments were carried out in the Hive environment to verify the efficiency of the algorithms on big data. The framework also supports the execution of existing algorithms without any changes. The model in the article outperforms existing models.
ISSN: 2196-1115