On Using Linear Diophantine Equations for in-Parallel Hiding of Decision Tree Rules

Bibliographic Details
Main Authors: Georgios Feretzakis, Dimitris Kalles, Vassilios S. Verykios
Format: Article
Language: English
Published: MDPI AG 2019-01-01
Series: Entropy
Online Access: http://www.mdpi.com/1099-4300/21/1/66
Description
Summary: Data sharing among organizations has become increasingly common in sectors such as advertising, marketing, electronic commerce, banking, and insurance. However, an organization that shares its datasets with others will most likely try to keep some patterns hidden. This paper focuses on preserving the privacy of sensitive patterns when inducing decision trees. We adopt a record augmentation approach to hide critical classification rules in binary datasets. This methodology is preferred over other heuristic solutions, such as output perturbation or cryptographic techniques, which limit the usability of the data, because it leaves the raw data itself readily available for public use. We propose a look-ahead technique using linear Diophantine equations to add the appropriate number of instances while maintaining the initial entropy of the nodes. This method can be used to hide one or more decision tree rules optimally.
ISSN: 1099-4300
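
Illustrative note: this record does not reproduce the paper's algorithm. The sketch below only illustrates the idea named in the summary, under a simplifying assumption that is ours and not the authors': a binary node keeps its entropy when added instances leave its class ratio unchanged, which reduces to the homogeneous linear Diophantine equation neg*x - pos*y = 0. All function names are hypothetical, and the general solver is the standard extended Euclidean method, not something taken from the paper.

```python
from math import gcd, log2


def entropy(pos, neg):
    """Shannon entropy in bits of a binary node with pos/neg instance counts."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            q = count / total
            h -= q * log2(q)
    return h


def extended_gcd(a, b):
    """Return (g, s, t) with a*s + b*t == g == gcd(a, b) for a, b >= 0."""
    if b == 0:
        return a, 1, 0
    g, s, t = extended_gcd(b, a % b)
    return g, t, s - (a // b) * t


def solve_linear_diophantine(a, b, c):
    """Return one integer solution (x, y) of a*x + b*y == c, or None if none exists.

    Assumes a, b >= 0 and not both zero.
    """
    g, s, t = extended_gcd(a, b)
    if c % g != 0:
        return None  # solvable only when gcd(a, b) divides c
    scale = c // g
    return s * scale, t * scale


def ratio_preserving_addition(pos, neg, min_pos=0, min_neg=0):
    """Smallest (x, y) with x >= min_pos, y >= min_neg and (pos+x):(neg+y) == pos:neg.

    Assumption (ours): preserving the class ratio is how entropy is kept fixed.
    Solutions of neg*x - pos*y == 0 are x = k*pos/g, y = k*neg/g for integer k.
    Requires pos > 0 and neg > 0.
    """
    g = gcd(pos, neg)
    step_pos, step_neg = pos // g, neg // g
    # Ceiling division without floats: -(-a // b)
    k = max(-(-min_pos // step_pos), -(-min_neg // step_neg), 0)
    return k * step_pos, k * step_neg


if __name__ == "__main__":
    x, y = ratio_preserving_addition(6, 4, min_pos=4)
    print("add", x, "positive and", y, "negative instances")
    print(entropy(6, 4), entropy(6 + x, 4 + y))
```

In this sketch, `min_pos` and `min_neg` stand in for instances that some other step already requires at the node; the function then finds the smallest padding that restores the original ratio.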