A Time Series Data Filling Method Based on LSTM—Taking the Stem Moisture as an Example
To address data loss in sensor data collection, this paper takes plant stem moisture data as its object and compares the values filled in for the same missing data segment by different data filling methods, in order to verify the validity and accuracy of the stem water fill...
Main Authors: | Wei Song, Chao Gao, Yue Zhao, Yandong Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-09-01 |
Series: | Sensors |
Subjects: | data filling; LSTM neural network; missing data; stem moisture |
Online Access: | https://www.mdpi.com/1424-8220/20/18/5045 |
id |
doaj-d4449c9b4bde420abf7419690d042857 |
---|---|
record_format |
Article |
spelling |
doaj-d4449c9b4bde420abf7419690d042857; 2020-11-25T03:25:16Z; eng; MDPI AG; Sensors; 1424-8220; 2020-09-01; 20; 5045; 5045; 10.3390/s20185045; A Time Series Data Filling Method Based on LSTM—Taking the Stem Moisture as an Example; Wei Song (School of Technology, Beijing Forestry University, Beijing 100083, China); Chao Gao (School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048, China); Yue Zhao (School of Technology, Beijing Forestry University, Beijing 100083, China); Yandong Zhao (School of Technology, Beijing Forestry University, Beijing 100083, China); To address data loss in sensor data collection, this paper takes plant stem moisture data as its object and compares the values filled in for the same missing data segment by different data filling methods, in order to verify the validity and accuracy of stem moisture filling with the LSTM (Long Short-Term Memory) model. The original stem moisture data were taken from Lagerstroemia indica planted in the Haidian District of Beijing in June 2017. Part of the data was manually deleted and treated as missing. Interpolation methods, time series statistical methods, the RNN (Recurrent Neural Network), and the LSTM neural network were used to fill in the missing part, and the filling results were compared with the original data. The results show that the LSTM is more accurate than the RNN, and that the bidirectional LSTM model has the smallest error values of the models compared, much lower than those of the other methods. The MAPE (mean absolute percentage error) of the bidirectional LSTM model is 1.813%. Increasing the length of the training data further confirmed the effectiveness of the model. Further, to address the accumulation of error in one-dimensional filling, the LSTM model was used in a multi-dimensional filling experiment with environmental data. After comparing the filling results for different environmental parameters, three parameters were selected as input: air humidity, photosynthetically active radiation, and soil temperature. The results show that multi-dimensional filling can greatly extend the sequence length while maintaining accuracy, making up for the error that one-dimensional filling accumulates as the sequence grows. The minimum MAPE of multi-dimensional filling is 1.499%. In conclusion, the LSTM-based data filling method has a great advantage in filling long runs of missing time series data and offers a new approach to data filling.; https://www.mdpi.com/1424-8220/20/18/5045; data filling; LSTM neural network; missing data; stem moisture |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Wei Song Chao Gao Yue Zhao Yandong Zhao |
spellingShingle |
Wei Song Chao Gao Yue Zhao Yandong Zhao A Time Series Data Filling Method Based on LSTM—Taking the Stem Moisture as an Example Sensors data filling LSTM neural network missing data stem moisture |
author_facet |
Wei Song Chao Gao Yue Zhao Yandong Zhao |
author_sort |
Wei Song |
title |
A Time Series Data Filling Method Based on LSTM—Taking the Stem Moisture as an Example |
title_short |
A Time Series Data Filling Method Based on LSTM—Taking the Stem Moisture as an Example |
title_full |
A Time Series Data Filling Method Based on LSTM—Taking the Stem Moisture as an Example |
title_fullStr |
A Time Series Data Filling Method Based on LSTM—Taking the Stem Moisture as an Example |
title_full_unstemmed |
A Time Series Data Filling Method Based on LSTM—Taking the Stem Moisture as an Example |
title_sort |
time series data filling method based on lstm—taking the stem moisture as an example |
publisher |
MDPI AG |
series |
Sensors |
issn |
1424-8220 |
publishDate |
2020-09-01 |
description |
To address data loss in sensor data collection, this paper takes plant stem moisture data as its object and compares the values filled in for the same missing data segment by different data filling methods, in order to verify the validity and accuracy of stem moisture filling with the LSTM (Long Short-Term Memory) model. The original stem moisture data were taken from Lagerstroemia indica planted in the Haidian District of Beijing in June 2017. Part of the data was manually deleted and treated as missing. Interpolation methods, time series statistical methods, the RNN (Recurrent Neural Network), and the LSTM neural network were used to fill in the missing part, and the filling results were compared with the original data. The results show that the LSTM is more accurate than the RNN, and that the bidirectional LSTM model has the smallest error values of the models compared, much lower than those of the other methods. The MAPE (mean absolute percentage error) of the bidirectional LSTM model is 1.813%. Increasing the length of the training data further confirmed the effectiveness of the model. Further, to address the accumulation of error in one-dimensional filling, the LSTM model was used in a multi-dimensional filling experiment with environmental data. After comparing the filling results for different environmental parameters, three parameters were selected as input: air humidity, photosynthetically active radiation, and soil temperature. The results show that multi-dimensional filling can greatly extend the sequence length while maintaining accuracy, making up for the error that one-dimensional filling accumulates as the sequence grows. The minimum MAPE of multi-dimensional filling is 1.499%. In conclusion, the LSTM-based data filling method has a great advantage in filling long runs of missing time series data and offers a new approach to data filling. |
topic |
data filling LSTM neural network missing data stem moisture |
url |
https://www.mdpi.com/1424-8220/20/18/5045 |
work_keys_str_mv |
AT weisong atimeseriesdatafillingmethodbasedonlstmtakingthestemmoistureasanexample AT chaogao atimeseriesdatafillingmethodbasedonlstmtakingthestemmoistureasanexample AT yuezhao atimeseriesdatafillingmethodbasedonlstmtakingthestemmoistureasanexample AT yandongzhao atimeseriesdatafillingmethodbasedonlstmtakingthestemmoistureasanexample AT weisong timeseriesdatafillingmethodbasedonlstmtakingthestemmoistureasanexample AT chaogao timeseriesdatafillingmethodbasedonlstmtakingthestemmoistureasanexample AT yuezhao timeseriesdatafillingmethodbasedonlstmtakingthestemmoistureasanexample AT yandongzhao timeseriesdatafillingmethodbasedonlstmtakingthestemmoistureasanexample |
_version_ |
1724597924448436224 |
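
The evaluation that the abstract describes (delete a segment, fill it in, then compare the filled values with the originals using MAPE) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the synthetic sine-wave series stands in for stem moisture readings and is not data from the paper, `linear_fill` and `mape` are hypothetical helper names, and linear interpolation is used only as a simple baseline of the interpolation kind the abstract mentions, not as the paper's LSTM model.

```python
import math


def mape(actual, filled):
    """Mean absolute percentage error, in percent, over paired values.

    Assumes no value in `actual` is zero.
    """
    if len(actual) != len(filled) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, filled))
    return 100.0 * total / len(actual)


def linear_fill(series, start, end):
    """Return a copy of `series` with series[start:end] replaced by a
    straight line between the neighbouring known values.

    Assumes 0 < start < end < len(series), so both neighbours exist.
    """
    left, right = series[start - 1], series[end]
    span = end - start + 1  # number of steps from left neighbour to right
    filled = list(series)
    for k in range(start, end):
        t = (k - start + 1) / span
        filled[k] = left + t * (right - left)
    return filled


# Synthetic stand-in for a stem moisture trace (NOT data from the paper).
original = [40.0 + 2.0 * math.sin(2 * math.pi * i / 48) for i in range(96)]

# Treat one segment as missing, fill it, and score against the held-out truth.
start, end = 30, 42
filled = linear_fill(original, start, end)
error = mape(original[start:end], filled[start:end])
print(f"MAPE over the filled gap: {error:.3f}%")
```

Swapping `linear_fill` for another filling method while keeping the same deleted segment and the same `mape` call gives the kind of like-for-like comparison the abstract reports.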