Text Mining of Hazard and Operability Analysis Reports Based on Active Learning

In the field of chemical safety, a named entity recognition (NER) model based on deep learning can mine valuable information from hazard and operability analysis (HAZOP) text, which can guide experts to carry out a new round of HAZOP analysis, help practitioners optimize the hidden dangers in the sy...

Bibliographic Details
Main Authors:	Zhenhua Wang, Beike Zhang, Dong Gao
Format:	Article
Language:	English
Published:	MDPI AG 2021-07-01
Series:	Processes
Subjects:	active learning, sampling algorithm, hazard and operability analysis, deep learning, named entity recognition
Online Access:	https://www.mdpi.com/2227-9717/9/7/1178

id	doaj-a0880bf5cd9843d1882e99238feb35c5
record_format	Article
spelling	doaj-a0880bf5cd9843d1882e99238feb35c5; 2021-07-23T14:03:14Z; eng; MDPI AG; Processes; 2227-9717; 2021-07-01; 9; 1178; 1178; 10.3390/pr9071178; Text Mining of Hazard and Operability Analysis Reports Based on Active Learning; Zhenhua Wang [0]; Beike Zhang [1]; Dong Gao [2]; [0] College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China; [1] College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China; [2] College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China; In the field of chemical safety, a named entity recognition (NER) model based on deep learning can mine valuable information from hazard and operability analysis (HAZOP) text, which can guide experts to carry out a new round of HAZOP analysis, help practitioners optimize the hidden dangers in the system, and be of great significance to improve the safety of the whole chemical system. However, due to the standardization and professionalism of chemical safety analysis text, it is difficult to improve the performance of traditional models. To solve this problem, in this study, an improved method based on active learning is proposed, and three novel sampling algorithms are designed, Variation of Token Entropy (VTE), HAZOP Confusion Entropy (HCE) and Amplification of Least Confidence (ALC), which improve the ability of the model to understand HAZOP text. In this method, a part of data is used to establish the initial model. The sampling algorithm is then used to select high-quality samples from the data set. Finally, these high-quality samples are used to retrain the whole model to obtain the final model. The experimental results show that the performance of the VTE, HCE, and ALC algorithms are better than that of random sampling algorithms. In addition, compared with other methods, the performance of the traditional model is improved effectively by the method proposed in this paper, which proves that the method is reliable and advanced.; https://www.mdpi.com/2227-9717/9/7/1178; active learning; sampling algorithm; hazard and operability analysis; deep learning; named entity recognition
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Zhenhua Wang, Beike Zhang, Dong Gao
spellingShingle	Zhenhua Wang, Beike Zhang, Dong Gao; Text Mining of Hazard and Operability Analysis Reports Based on Active Learning; Processes; active learning, sampling algorithm, hazard and operability analysis, deep learning, named entity recognition
author_facet	Zhenhua Wang, Beike Zhang, Dong Gao
author_sort	Zhenhua Wang
title	Text Mining of Hazard and Operability Analysis Reports Based on Active Learning
title_short	Text Mining of Hazard and Operability Analysis Reports Based on Active Learning
title_full	Text Mining of Hazard and Operability Analysis Reports Based on Active Learning
title_fullStr	Text Mining of Hazard and Operability Analysis Reports Based on Active Learning
title_full_unstemmed	Text Mining of Hazard and Operability Analysis Reports Based on Active Learning
title_sort	text mining of hazard and operability analysis reports based on active learning
publisher	MDPI AG
series	Processes
issn	2227-9717
publishDate	2021-07-01
description	In chemical safety, a named entity recognition (NER) model based on deep learning can mine valuable information from hazard and operability analysis (HAZOP) text. This information can guide experts through a new round of HAZOP analysis and help practitioners address hidden dangers in the system, which is of great significance for improving the safety of the whole chemical system. However, because chemical safety analysis text is highly standardized and specialized, it is difficult to improve the performance of traditional models. To solve this problem, this study proposes an improved method based on active learning and designs three novel sampling algorithms, Variation of Token Entropy (VTE), HAZOP Confusion Entropy (HCE), and Amplification of Least Confidence (ALC), which improve the model's ability to understand HAZOP text. In this method, part of the data is used to establish the initial model. The sampling algorithm then selects high-quality samples from the data set. Finally, these high-quality samples are used to retrain the whole model to obtain the final model. The experimental results show that the VTE, HCE, and ALC algorithms perform better than random sampling. In addition, compared with other methods, the proposed method effectively improves the performance of the traditional model, which indicates that the method is reliable and advanced.
topic	active learning, sampling algorithm, hazard and operability analysis, deep learning, named entity recognition
url	https://www.mdpi.com/2227-9717/9/7/1178
work_keys_str_mv	AT zhenhuawang textminingofhazardandoperabilityanalysisreportsbasedonactivelearning AT beikezhang textminingofhazardandoperabilityanalysisreportsbasedonactivelearning AT donggao textminingofhazardandoperabilityanalysisreportsbasedonactivelearning
_version_	1721286230393487360
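
Note on the sampling step. The description field above outlines a workflow: train an initial model on part of the data, score the remaining samples, and retrain on the selected ones. This record does not define VTE, HCE, or ALC, so the sketch below uses two standard uncertainty scores (least confidence and mean token entropy) as stand-ins, not the authors' algorithms. All function names and the toy probabilities are hypothetical and exist only for illustration.

```python
import math
import random


def token_entropy(probs):
    """Shannon entropy (in nats) of one token's label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def least_confidence(sentence_probs):
    """1 minus the mean of each token's highest label probability."""
    top = [max(p) for p in sentence_probs]
    return 1.0 - sum(top) / len(top)


def mean_token_entropy(sentence_probs):
    """Average per-token entropy over a sentence."""
    return sum(token_entropy(p) for p in sentence_probs) / len(sentence_probs)


def select_samples(pool_probs, k, score=least_confidence):
    """Return indices of the k sentences the model is least sure about."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: score(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


def random_baseline(pool_size, k, seed=0):
    """Random sampling, the baseline named in the abstract."""
    return random.Random(seed).sample(range(pool_size), k)


# Assumed model outputs: per-token probabilities over 3 entity labels.
pool = [
    [[0.98, 0.01, 0.01], [0.97, 0.02, 0.01]],  # confident predictions
    [[0.40, 0.35, 0.25], [0.34, 0.33, 0.33]],  # very uncertain
    [[0.90, 0.05, 0.05], [0.60, 0.30, 0.10]],  # mixed
]

chosen = select_samples(pool, k=2)
print(chosen)  # [1, 2]
```

In a full active learning loop, the sentences at the chosen indices would be labeled and added to the training set before the model is retrained; that training step is omitted here because the record gives no details about the model.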
