Learning With Imbalanced Data in Smart Manufacturing: A Comparative Analysis

The Internet of Things (IoT) paradigm is revolutionising the world of manufacturing into what is known as Smart Manufacturing or Industry 4.0. The main pillar in smart manufacturing looks at harnessing IoT data and leveraging machine learning (ML) to automate the prediction of faults, thus cutting m...

Bibliographic Details
Main Authors: Yasmin Fathy, Mona Jaber, Alexandra Brintrup
Format: Article
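Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects: Manufacturing analytics; generative modeling; smart manufacturing; imbalanced data; limited failure data; generating synthetic data
Online Access: https://ieeexplore.ieee.org/document/9309288/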
id doaj-3fcd129445424d2189096501bedbcf14
record_format Article
spelling doaj-3fcd129445424d2189096501bedbcf14
2021-03-30T14:58:01Z
eng
IEEE
IEEE Access
ISSN 2169-3536
2021-01-01
Volume 9, pp. 2734-2757
DOI 10.1109/ACCESS.2020.3047838
Document 9309288
Learning With Imbalanced Data in Smart Manufacturing: A Comparative Analysis
Yasmin Fathy (https://orcid.org/0000-0001-7398-5283), Department of Engineering, University of Cambridge, Cambridge, U.K.
Mona Jaber (https://orcid.org/0000-0002-0908-3207), School of Electronic Engineering and Computer Science, Queen Mary University of London, London, U.K.
Alexandra Brintrup (https://orcid.org/0000-0002-4189-2434), Department of Engineering, University of Cambridge, Cambridge, U.K.
collection DOAJ
language English
format Article
sources DOAJ
author Yasmin Fathy
Mona Jaber
Alexandra Brintrup
title Learning With Imbalanced Data in Smart Manufacturing: A Comparative Analysis
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2021-01-01
description The Internet of Things (IoT) paradigm is transforming manufacturing into what is known as Smart Manufacturing or Industry 4.0. The main pillar of smart manufacturing is harnessing IoT data and leveraging machine learning (ML) to automate the prediction of faults, thus cutting maintenance time and cost and improving product quality. However, faults in real industries are overwhelmingly outweighed by instances of good performance (faultless samples); this bias is reflected in the data captured by IoT devices. Imbalanced data limits the success of ML in predicting faults and thus presents a significant hindrance to the progress of smart manufacturing. Although various techniques have been proposed to tackle this challenge in general, this work is the first to present a framework for evaluating the effectiveness of these remedies in the context of manufacturing. We present a comprehensive comparative analysis in which we apply our proposed framework to benchmark the performance of different combinations of algorithm components using a real-world manufacturing dataset. We draw key insights into the effectiveness of each component and the inter-relatedness between the dataset, the application context, and the design of the ML algorithm.
topic Manufacturing analytics
generative modeling
smart manufacturing
imbalanced data
limited failure data
generating synthetic data
url https://ieeexplore.ieee.org/document/9309288/