Outlier Detection for Monitoring Data Using Stacked Autoencoder

Monitoring data contain important status information about the monitored object and are the basis for subsequent data mining and analysis. However, monitoring data are often contaminated by outliers, which degrades subsequent data processing. To address the problem,...

Bibliographic Details
Main Authors: Fangyi Wan, Gaodeng Guo, Chunlin Zhang, Qing Guo, Jie Liu
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects: Condition monitoring, outlier detection, stacked autoencoder, monitoring data
Online Access: https://ieeexplore.ieee.org/document/8917639/
id doaj-5b34d714eb0544f393983773c7438806
record_format Article
spelling doaj-5b34d714eb0544f393983773c7438806
2021-03-30T00:48:12Z
eng
IEEE
IEEE Access
2169-3536
2019-01-01
7
173827
173837
10.1109/ACCESS.2019.2956494
8917639
Outlier Detection for Monitoring Data Using Stacked Autoencoder
Fangyi Wan (0) https://orcid.org/0000-0002-4404-5820
Gaodeng Guo (1) https://orcid.org/0000-0003-2774-6876
Chunlin Zhang (2) https://orcid.org/0000-0002-2488-0482
Qing Guo (3) https://orcid.org/0000-0002-1220-3804
Jie Liu (4) https://orcid.org/0000-0001-7493-0882
School of Aeronautics, Northwestern Polytechnical University, Xi’an, China
collection DOAJ
language English
format Article
sources DOAJ
author Fangyi Wan
Gaodeng Guo
Chunlin Zhang
Qing Guo
Jie Liu
title Outlier Detection for Monitoring Data Using Stacked Autoencoder
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2019-01-01
description Monitoring data contain important status information about the monitored object and are the basis for subsequent data mining and analysis. However, monitoring data are often contaminated by outliers, which degrades subsequent data processing. To address this problem, the paper proposes an outlier detection method based on a stacked autoencoder (SAE). An SAE has a strong feature-extraction capability and largely preserves the original information in the data. An SAE trained on normal data learns the characteristics of normal data. When data containing outliers are fed to the trained network, the reconstruction errors between the original input and the reconstruction produced by the encoding and decoding parameter mappings are larger at the outliers, which provides a basis for locating them. Building on the traditional threshold method, the paper introduces the Grubbs criterion and the PauTa criterion to identify the reconstruction errors that correspond to outliers. The method can quickly separate abnormal data from normal data according to the reconstruction error and the identification criterion. The effectiveness and superiority of the proposed method are validated by experiments on real data and by comparisons with traditional outlier detection algorithms.
topic Condition monitoring
outlier detection
stacked autoencoder
monitoring data
url https://ieeexplore.ieee.org/document/8917639/
_version_ 1724187804477423616
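
The detection step summarized in the description above (score each sample by its reconstruction error, then apply an identification criterion) might be sketched roughly as follows. This is an illustration only, not the authors' implementation: the function names are hypothetical, the error values are made up, and no autoencoder is trained here, so the errors stand in for what a trained SAE would produce. The PauTa criterion is taken in its usual 3-sigma form, and for the Grubbs criterion only the test statistic is computed; the critical value it is compared against depends on the sample size and significance level and is left to the caller.

```python
import statistics


def reconstruction_errors(original, reconstructed):
    """Absolute per-sample error between an input and its reconstruction."""
    return [abs(x - y) for x, y in zip(original, reconstructed)]


def pauta_outliers(errors, k=3.0):
    """Indices whose error lies more than k sample standard deviations
    from the mean error (k = 3 is the usual PauTa / 3-sigma rule)."""
    mean = statistics.mean(errors)
    std = statistics.stdev(errors)
    if std == 0:
        return []  # all errors identical: nothing stands out
    return [i for i, e in enumerate(errors) if abs(e - mean) > k * std]


def grubbs_statistic(errors):
    """Return (index, G) for the error farthest from the mean, where
    G = |e - mean| / s. The caller compares G with a critical value
    chosen for the sample size and significance level."""
    mean = statistics.mean(errors)
    std = statistics.stdev(errors)
    idx = max(range(len(errors)), key=lambda i: abs(errors[i] - mean))
    if std == 0:
        return idx, 0.0
    return idx, abs(errors[idx] - mean) / std


# Made-up reconstruction errors: 19 small values and one large one at index 19.
errors = [0.10, 0.12, 0.11, 0.09, 0.10, 0.13, 0.11, 0.10, 0.12, 0.09,
          0.11, 0.10, 0.12, 0.11, 0.10, 0.09, 0.13, 0.10, 0.11, 5.0]

flagged = pauta_outliers(errors)              # flags index 19
suspect_index, g_value = grubbs_statistic(errors)
print(flagged, suspect_index, round(g_value, 2))
```

Note that with a sample standard deviation the 3-sigma rule cannot flag anything in very small samples (roughly ten points or fewer), because a single extreme value inflates the standard deviation too much; this is one reason a sample-size-aware criterion such as Grubbs can be preferable for short records.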