A Data-Based Framework for Identifying a Source Location of a Contaminant Spill in a River System with Random Measurement Errors

This study addresses the problem of identifying the source location of a contaminant spill in a river system when a sensor network returns observations containing random measurement errors. To solve this problem, we suggest a new framework comprising three main steps: (i) spill detection, (ii) data...

Bibliographic Details
Main Authors: Jun Hyeong Kim, Mi Lim Lee, Chuljin Park
Format: Article
Language: English
Published: MDPI AG 2019-08-01
Series: Sensors
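Subjects: source identification; sensor network; water quality monitoring; river system; statistical process control; random forest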
Online Access: https://www.mdpi.com/1424-8220/19/15/3378
id doaj-91b7ad250e1843a6a1f281e15dd3f371
record_format Article
spelling doaj-91b7ad250e1843a6a1f281e15dd3f371
2020-11-24T21:38:51Z
eng
MDPI AG
Sensors
1424-8220
2019-08-01
19
15
3378
10.3390/s19153378
s19153378
A Data-Based Framework for Identifying a Source Location of a Contaminant Spill in a River System with Random Measurement Errors
Jun Hyeong Kim [0]
Mi Lim Lee [1]
Chuljin Park [2]
[0] Department of Industrial Engineering, Hanyang University, 222 Wangsimni-Ro, Seongdong gu, Seoul 04763, Korea
[1] College of Business Administration, Hongik University, 94, Wausan-ro, Mapo-gu, Seoul 04066, Korea
[2] Department of Industrial Engineering, Hanyang University, 222 Wangsimni-Ro, Seongdong gu, Seoul 04763, Korea
This study addresses the problem of identifying the source location of a contaminant spill in a river system when a sensor network returns observations containing random measurement errors. To solve this problem, we suggest a new framework comprising three main steps: (i) spill detection, (ii) data preprocessing, and (iii) source identification. Specifically, we applied a statistical process control chart to detect a contaminant spill with measurement errors while keeping the false alarm rate at less than or equal to a user-specified value. After detecting a spill, we generated a nonlinear regression model to estimate a breakthrough curve of the observations and derive a characteristic vector of the estimated curve. Using the characteristic vector as an input, a random forest model was constructed with the sensor raising the first alarm. The model provides output values between 0 and 1 to represent the possibility of each candidate location being the true spill source. These possibility values allow users to identify strong candidate locations for the spill. The accuracy of our framework was tested on part of the Altamaha River system in Georgia, USA.
https://www.mdpi.com/1424-8220/19/15/3378
source identification
sensor network
water quality monitoring
river system
statistical process control
random forest
collection DOAJ
language English
format Article
sources DOAJ
author Jun Hyeong Kim
Mi Lim Lee
Chuljin Park
title A Data-Based Framework for Identifying a Source Location of a Contaminant Spill in a River System with Random Measurement Errors
publisher MDPI AG
series Sensors
issn 1424-8220
publishDate 2019-08-01
description This study addresses the problem of identifying the source location of a contaminant spill in a river system when a sensor network returns observations containing random measurement errors. To solve this problem, we suggest a new framework comprising three main steps: (i) spill detection, (ii) data preprocessing, and (iii) source identification. Specifically, we applied a statistical process control chart to detect a contaminant spill with measurement errors while keeping the false alarm rate at less than or equal to a user-specified value. After detecting a spill, we generated a nonlinear regression model to estimate a breakthrough curve of the observations and derive a characteristic vector of the estimated curve. Using the characteristic vector as an input, a random forest model was constructed with the sensor raising the first alarm. The model provides output values between 0 and 1 to represent the possibility of each candidate location being the true spill source. These possibility values allow users to identify strong candidate locations for the spill. The accuracy of our framework was tested on part of the Altamaha River system in Georgia, USA.
topic source identification
sensor network
water quality monitoring
river system
statistical process control
random forest
url https://www.mdpi.com/1424-8220/19/15/3378