Privacy Preserving Data Publishing for Multiple Sensitive Attributes Based on Security Level

Bibliographic Details
Main Authors: Yuelei Xiao, Haiqi Li
Format: Article
Language: English
Published: MDPI AG 2020-03-01
Series: Information
Online Access: https://www.mdpi.com/2078-2489/11/3/166
Description
Summary: Privacy preserving data publishing has received considerable attention as a way to publish useful information while preserving data privacy. Existing privacy preserving data publishing methods for multiple sensitive attributes do not consider that different values of a sensitive attribute may have different sensitivity requirements. To solve this problem, we defined three security levels for sensitive attribute values with different sensitivity requirements and gave an $L_{sl}$-diversity model for multiple sensitive attributes. We then proposed three greedy algorithms that combine the maximal-bucket first (MBF), maximal single-dimension-capacity first (MSDCF) and maximal multi-dimension-capacity first (MMDCF) algorithms with the maximal security-level first (MSLF) greedy policy, named MBF based on MSLF (MBF-MSLF), MSDCF based on MSLF (MSDCF-MSLF) and MMDCF based on MSLF (MMDCF-MSLF), to implement the $L_{sl}$-diversity model for multiple sensitive attributes. The experimental results show that the three algorithms greatly reduce the information loss of the published microdata with only a small increase in runtime, and that their information loss tends to stabilize as the data volume increases. They also solve the problem that the information loss of MBF, MSDCF and MMDCF increases greatly as the number of sensitive attributes increases.
ISSN: 2078-2489
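
Illustration: the summary names a greedy grouping strategy that favors large buckets and high security levels but does not spell out the procedure. The following is a minimal sketch of that general idea, not the authors' MBF-MSLF algorithm. It is simplified to a single sensitive attribute, and the function name `greedy_groups`, the `level_of` mapping, the sample values, and the exact ordering rule (security level first, then bucket size) are all assumptions made for illustration.

```python
from collections import defaultdict


def greedy_groups(records, level_of, l):
    """Form groups of l records with pairwise distinct sensitive values.

    records  : list of (record_id, sensitive_value) pairs
    level_of : hypothetical map from sensitive value to security level
               (higher means more sensitive); unknown values default to 0
    l        : number of distinct sensitive values required in each group
    """
    # Bucketize record ids by sensitive value.
    buckets = defaultdict(list)
    for rid, value in records:
        buckets[value].append(rid)

    groups = []
    while True:
        nonempty = [v for v in buckets if buckets[v]]
        if len(nonempty) < l:
            break  # not enough distinct values left to build another group
        # Assumed policy: higher security level first, then larger bucket,
        # then the value itself so the result is deterministic.
        nonempty.sort(key=lambda v: (-level_of.get(v, 0), -len(buckets[v]), v))
        chosen = nonempty[:l]
        # Take one record from each chosen bucket.
        groups.append([(buckets[v].pop(), v) for v in chosen])

    # Records that could not be placed in any group.
    leftover = [(rid, v) for v in buckets for rid in buckets[v]]
    return groups, leftover


# Made-up sample data, for illustration only.
records = [(1, "HIV"), (2, "flu"), (3, "flu"), (4, "cold"), (5, "HIV"), (6, "cold")]
level_of = {"HIV": 3, "flu": 1, "cold": 1}
groups, leftover = greedy_groups(records, level_of, l=2)
```

Every group returned contains `l` different sensitive values, so no group exposes a single value for all of its members. Records that cannot be grouped are returned separately, since how such residual records are handled is not stated in the summary.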