Multi-Party Privacy-Preserving Logistic Regression with Poor Quality Data Filtering for IoT Contributors

Nowadays, the internet of things (IoT) is used to generate data in several application domains. A logistic regression, which is a standard machine learning algorithm with a wide application range, is built on such data. Nevertheless, building a powerful and effective logistic regression model requir...


Bibliographic Details
Main Authors: Kennedy Edemacu, Jong Wook Kim
Format: Article
Language: English
Published: MDPI AG 2021-08-01
Series: Electronics
Subjects:
IoT
Online Access: https://www.mdpi.com/2079-9292/10/17/2049
id doaj-6788b9ba704c4e969e5e191bb18f1e16
record_format Article
spelling doaj-6788b9ba704c4e969e5e191bb18f1e16
2021-09-09T13:41:51Z
eng
MDPI AG
Electronics
2079-9292
2021-08-01
10
2049
2049
10.3390/electronics10172049
Multi-Party Privacy-Preserving Logistic Regression with Poor Quality Data Filtering for IoT Contributors
Kennedy Edemacu 0
Jong Wook Kim 1
Department of Computer Science, Sangmyung University, Seoul 03016, Korea
Department of Computer Science, Sangmyung University, Seoul 03016, Korea
Nowadays, the internet of things (IoT) is used to generate data in several application domains. A logistic regression, which is a standard machine learning algorithm with a wide application range, is built on such data. Nevertheless, building a powerful and effective logistic regression model requires large amounts of data. Thus, collaboration between multiple IoT participants has often been the go-to approach. However, privacy concerns and poor data quality are two challenges that threaten the success of such a setting. Several studies have proposed different methods to address the privacy concern but to the best of our knowledge, little attention has been paid towards addressing the poor data quality problems in the multi-party logistic regression model. Thus, in this study, we propose a multi-party privacy-preserving logistic regression framework with poor quality data filtering for IoT data contributors to address both problems. Specifically, we propose a new metric gradient similarity in a distributed setting that we employ to filter out parameters from data contributors with poor quality data. To solve the privacy challenge, we employ homomorphic encryption. Theoretical analysis and experimental evaluations using real-world datasets demonstrate that our proposed framework is privacy-preserving and robust against poor quality data.
https://www.mdpi.com/2079-9292/10/17/2049
IoT
logistic regression
homomorphic encryption
multi-party
gradient similarity
data quality
collection DOAJ
language English
format Article
sources DOAJ
author Kennedy Edemacu
Jong Wook Kim
spellingShingle Kennedy Edemacu
Jong Wook Kim
Multi-Party Privacy-Preserving Logistic Regression with Poor Quality Data Filtering for IoT Contributors
Electronics
IoT
logistic regression
homomorphic encryption
multi-party
gradient similarity
data quality
author_facet Kennedy Edemacu
Jong Wook Kim
author_sort Kennedy Edemacu
title Multi-Party Privacy-Preserving Logistic Regression with Poor Quality Data Filtering for IoT Contributors
publisher MDPI AG
series Electronics
issn 2079-9292
publishDate 2021-08-01
description Nowadays, the internet of things (IoT) is used to generate data in several application domains. A logistic regression, which is a standard machine learning algorithm with a wide application range, is built on such data. Nevertheless, building a powerful and effective logistic regression model requires large amounts of data. Thus, collaboration between multiple IoT participants has often been the go-to approach. However, privacy concerns and poor data quality are two challenges that threaten the success of such a setting. Several studies have proposed different methods to address the privacy concern but to the best of our knowledge, little attention has been paid towards addressing the poor data quality problems in the multi-party logistic regression model. Thus, in this study, we propose a multi-party privacy-preserving logistic regression framework with poor quality data filtering for IoT data contributors to address both problems. Specifically, we propose a new metric gradient similarity in a distributed setting that we employ to filter out parameters from data contributors with poor quality data. To solve the privacy challenge, we employ homomorphic encryption. Theoretical analysis and experimental evaluations using real-world datasets demonstrate that our proposed framework is privacy-preserving and robust against poor quality data.
topic IoT
logistic regression
homomorphic encryption
multi-party
gradient similarity
data quality
url https://www.mdpi.com/2079-9292/10/17/2049