PPDCA: Privacy-Preserving Crowdsourcing Data Collection and Analysis With Randomized Response

Randomized response mechanisms for guaranteeing crowdsourcing data privacy have attracted scholarly attention; aggregators can ensure privacy by collecting only randomized data, and individuals can have plausible deniability regarding their responses. With these mechanisms, analysts employed by orga...
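As a rough illustration of the general idea behind randomized response, the sketch below implements the classic binary (Warner-style) mechanism, not the complementary randomized response (C-RR) method of the article. Each individual reports their true bit with probability `p_truth` and the flipped bit otherwise, which gives plausible deniability; the aggregator then inverts the known noise to estimate the population proportion. The function names, the population, and the value `p_truth = 0.75` (corresponding to local differential privacy with epsilon = ln 3) are assumptions chosen for the example.

```python
import random


def randomize_bit(true_bit, p_truth, rng):
    """Report the true bit with probability p_truth, otherwise its complement."""
    return true_bit if rng.random() < p_truth else 1 - true_bit


def estimate_proportion(reports, p_truth):
    """Unbiased estimate of the fraction of true 1s from randomized reports.

    E[observed] = p * pi + (1 - p) * (1 - pi), hence
    pi = (observed - (1 - p)) / (2p - 1).
    """
    if p_truth == 0.5:
        raise ValueError("p_truth = 0.5 carries no information about the data")
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth)) / (2 * p_truth - 1)


# Hypothetical population: 30% of 10,000 individuals hold the sensitive bit 1.
rng = random.Random(0)
true_bits = [1] * 3000 + [0] * 7000
reports = [randomize_bit(b, 0.75, rng) for b in true_bits]
estimate = estimate_proportion(reports, 0.75)
```

The aggregator only ever sees `reports`, yet `estimate` should land close to the true proportion of 0.3 for a population of this size; smaller values of `p_truth` (closer to 0.5) give stronger deniability at the cost of a noisier estimate.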

Bibliographic Details
Main Authors: Yao-Tung Tsou, Bo-Cheng Lin
Format: Article
Language: English
Published: IEEE 2018-01-01
Series: IEEE Access
Subjects: Randomized response; local differential privacy; data analysis; randomized data
Online Access: https://ieeexplore.ieee.org/document/8555987/
id doaj-442f16c25f914b359a86a40bb82ca631
record_format Article
spelling doaj-442f16c25f914b359a86a40bb82ca631 | 2021-03-29T21:33:45Z | eng | IEEE | IEEE Access | 2169-3536 | 2018-01-01 | 6 | 76970 | 76983 | 10.1109/ACCESS.2018.2884511 | 8555987 | PPDCA: Privacy-Preserving Crowdsourcing Data Collection and Analysis With Randomized Response | Yao-Tung Tsou | 0 | https://orcid.org/0000-0002-7324-5135 | Bo-Cheng Lin | 1 | Department of Communications Engineering, Feng Chia University, Taichung, Taiwan | Department of Communications Engineering, Feng Chia University, Taichung, Taiwan | Randomized response mechanisms for guaranteeing crowdsourcing data privacy have attracted scholarly attention; aggregators can ensure privacy by collecting only randomized data, and individuals can have plausible deniability regarding their responses. With these mechanisms, analysts employed by organizations can still make predictions and conduct analyses using the randomized data. Existing randomized response-based data collection solutions have severely restricted functionality and usability, resulting in impractical and inefficient systems. Therefore, we developed a randomized response-based privacy-preserving crowdsourcing data collection and analysis mechanism. We designed a complementary randomized response (C-RR) method to guarantee individuals' data privacy and to preserve features from the original data for analysis. We formalized a machine learning framework; our proposed method uses randomized data in the form of binary vectors to generate a learning network. Extensive experiments on real-world data sets demonstrated that our heavy-hitters estimation scheme, which applies C-RR and our data learning model, significantly outperformed existing estimation schemes in terms of data analysis. | https://ieeexplore.ieee.org/document/8555987/ | Randomized response | local differential privacy | data analysis | randomized data
collection DOAJ
language English
format Article
sources DOAJ
author Yao-Tung Tsou
Bo-Cheng Lin
title PPDCA: Privacy-Preserving Crowdsourcing Data Collection and Analysis With Randomized Response
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2018-01-01
description Randomized response mechanisms for guaranteeing crowdsourcing data privacy have attracted scholarly attention; aggregators can ensure privacy by collecting only randomized data, and individuals can have plausible deniability regarding their responses. With these mechanisms, analysts employed by organizations can still make predictions and conduct analyses using the randomized data. Existing randomized response-based data collection solutions have severely restricted functionality and usability, resulting in impractical and inefficient systems. Therefore, we developed a randomized response-based privacy-preserving crowdsourcing data collection and analysis mechanism. We designed a complementary randomized response (C-RR) method to guarantee individuals' data privacy and to preserve features from the original data for analysis. We formalized a machine learning framework; our proposed method uses randomized data in the form of binary vectors to generate a learning network. Extensive experiments on real-world data sets demonstrated that our heavy-hitters estimation scheme, which applies C-RR and our data learning model, significantly outperformed existing estimation schemes in terms of data analysis.
topic Randomized response
local differential privacy
data analysis
randomized data
url https://ieeexplore.ieee.org/document/8555987/