Ant Colony System Sanitization Approach to Hiding Sensitive Itemsets


Bibliographic Details
Main Authors: Jimmy Ming-Tai Wu, Justin Zhan, Jerry Chun-Wei Lin
Format: Article
Language: English
Published: IEEE 2017-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/7922531/
id doaj-da15893f8c6c47e1a84b2daa681fa2c9
record_format Article
record_updated 2021-03-29T20:07:17Z
volume 5
pages 10024-10039
doi 10.1109/ACCESS.2017.2702281
article_number 7922531
author_details Jimmy Ming-Tai Wu (https://orcid.org/0000-0003-3740-2102), Department of Computer Science, University of Nevada, Las Vegas, NV, USA
Justin Zhan (https://orcid.org/0000-0003-4210-6279), Department of Computer Science, University of Nevada, Las Vegas, NV, USA
Jerry Chun-Wei Lin (https://orcid.org/0000-0001-8768-9709), School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China
collection DOAJ
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2017-01-01
description In recent years, privacy-preserving data mining (PPDM) has received considerable attention in data mining research. Some sensitive information in a database must not be revealed; PPDM makes it possible to discover other important knowledge while still hiding that critical information. Previous research has approached this problem in different ways, applying addition and deletion operations to modify the original database so that the sensitive information is hidden. However, finding an appropriate set of transactions/itemsets for hiding sensitive information is an NP-hard problem. In the past, evolutionary algorithms were developed to hide sensitive itemsets by building an appropriate database. Genetic-based algorithms and a particle swarm optimization-based algorithm, proposed in previous works, not only hide sensitive itemsets but also minimize the side effects of the sanitization process. In this paper, an ant colony system (ACS)-based algorithm called ACS2DT is proposed to decrease side effects and enhance the performance of the sanitization process. In each iteration, every ant in the population builds a tour, and each tour indicates the transactions to be deleted from the original database. The proposed algorithm introduces a heuristic function that guides each ant to select a suitable edge (transaction) for the current situation, and it defines several termination conditions to stop the sanitization process. The heuristic function applies the pre-large concept to monitor side effects and calculates the degree to which information has been hidden in order to adjust the selection policy for deleted transactions. The experimental results show that the proposed ACS2DT algorithm performs better than the Greedy algorithm and two other evolutionary algorithms in terms of runtime, the three side effects (fail to be hidden, not to be hidden, and not to be generated), and database similarity on both real-world and synthetic data sets.
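The three side effects named in the description can be illustrated with a small sketch. This is not the ACS2DT algorithm and does not come from the paper: the function names (`frequent_itemsets`, `side_effects`), the brute-force miner, the toy database, and the set of deleted transactions are all assumptions chosen for illustration. It assumes the definitions commonly used in the PPDM literature: a sensitive itemset that is still frequent after sanitization "fails to be hidden", a non-sensitive frequent itemset that is lost is "not to be hidden", and an itemset that becomes frequent only after sanitization is "not to be generated".

```python
from itertools import combinations


def frequent_itemsets(db, min_sup, max_len=3):
    """Brute-force miner (illustrative only): itemsets with support ratio >= min_sup."""
    if not db:
        return set()
    items = sorted({item for t in db for item in t})
    result = set()
    for k in range(1, max_len + 1):
        for cand in combinations(items, k):
            itemset = frozenset(cand)
            count = sum(1 for t in db if itemset <= t)
            if count / len(db) >= min_sup:
                result.add(itemset)
    return result


def side_effects(original, sanitized, sensitive, min_sup):
    """Return (fail_to_be_hidden, not_to_be_hidden, not_to_be_generated) itemsets."""
    before = frequent_itemsets(original, min_sup)
    after = frequent_itemsets(sanitized, min_sup)
    fail_to_be_hidden = sensitive & after            # sensitive, still frequent
    not_to_be_hidden = (before - sensitive) - after  # non-sensitive, lost
    not_to_be_generated = after - before             # newly frequent (artificial)
    return fail_to_be_hidden, not_to_be_hidden, not_to_be_generated


# Toy database (hypothetical); each transaction is a set of single-letter items.
db = [frozenset(t) for t in ("abc", "ab", "ac", "bc", "ab")]
sensitive = {frozenset("ab")}

# A candidate solution is a set of transaction indices to delete; in an
# ACS-style search each ant's tour would propose such a set.
deleted = {1, 4}
sanitized = [t for i, t in enumerate(db) if i not in deleted]

fth, nth, ntg = side_effects(db, sanitized, sensitive, min_sup=0.5)
print(len(fth), len(nth), len(ntg))
```

In this toy case, deleting the two transactions hides the sensitive itemset without losing any non-sensitive frequent itemset, but because the database shrinks, two previously infrequent itemsets cross the relative support threshold. This is the kind of trade-off that a sanitization heuristic has to balance.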
topic Privacy-preserving data mining
evolutionary algorithm
sensitive itemsets
ant colony system
url https://ieeexplore.ieee.org/document/7922531/